METHODS AND COMPOSITIONS
FOR DETECTING HEPATITIS E VIRUS 



Related Applications 

This application claims priority to U.S.S.N. 09/173,141, filed October 15, 1998, now 
pending, which claims priority under 35 U.S.C. § 119(e) to provisional application U.S.S.N. 
60/061,199, filed October 15, 1997, now abandoned, the disclosures of which are incorporated 
by reference herein.

Field of the Invention 

This invention relates generally to methods and compositions for detecting hepatitis E 
virus, and more particularly to methods and compositions for detecting, or for treating
individuals infected with, US-type and US-subtype strains of hepatitis E virus.

Background of the Invention

There are at least five major classes of hepatotropic viruses that cause inflammation of 
the liver (hepatitis). These viruses include hepatitis A virus (HAV), hepatitis B virus (HBV), 
hepatitis C virus (HCV), hepatitis D virus (HDV) and hepatitis E virus (HEV). Although only 
HBV, HCV and HDV cause chronic hepatitis, all five types cause acute disease either directly
or as a result of superinfection/co-infection by, for example, HBV and HDV. HEV causes
symptoms of hepatitis that are similar to those of other viral agents including abdominal pain,
jaundice, malaise, anorexia, dark urine, fever, nausea and vomiting (see, for example, Reyes et
al., "Molecular biology of non-A, non-B hepatitis agents: hepatitis C and hepatitis E viruses" in
Advances in Virus Research (1991) 40: 57-102; Bradley, "Hepatitis non-A, non-B viruses
become identified as hepatitis C and E viruses" in Progr. Med. Virol. (1990) 37: 101-135;
Hollinger, "Non-A, non-B hepatitis viruses" in Virology, Second Edition (1990), Raven Press,
New York, pp. 2239-2271; Gust et al., "Report of a workshop: waterborne non-A, non-B
hepatitis" J. Infect. Dis. (1987) 156: 630-635; and Krawczynski, "Hepatitis E" Hepatology
(1993) 17: 932-941). Unlike the other hepatotropic viruses, however, HEV generally has not
been perceived as being a significant cause of hepatitis in the US.
Geographic regions where HEV is endemic include eastern and northern Africa, India,
Pakistan, Burma and China (Reyes et al. (1991) supra). The case fatality rate of HEV infection
is estimated to be between about 0.1% and about 1.0% in the general population where HEV is
endemic, and as high as about 20% among pregnant women in developing countries. Most
fatalities result from fulminant hepatitis (Reyes et al. (1991) supra). The occasional cases of
HEV infection reported in the US, western Europe and Japan usually are observed in travelers
returning home from visits to areas where HEV is endemic. However, there is little
information pertaining to the morbidity and/or mortality of infection with HEV in the US since
HEV infections are not reported to a central agency. Extensive, systematic studies have not
been performed to determine the importance of HEV in the US. Further, if such studies were
performed, the relative importance of HEV in the US (and possibly Japan and western Europe)
may continue to be underestimated unless the proper reagents are developed to conduct such a
study.

HEV is a non-enveloped virus, approximately 27-30 nm in diameter, possessing a
positive-sense, single-stranded RNA genome which comprises three discontinuous open
reading frames (ORFs), referred to in the art as open reading frame 1 (ORF 1), open reading
frame 2 (ORF 2), and open reading frame 3 (ORF 3). Based on the overall morphology of the
virus and the size and organization of the genome, the virus is tentatively classified as a
member of the Caliciviridae. The first two isolates of HEV to be identified and sequenced
were obtained from Burma and from Mexico. The overall nucleic acid identity across the
genomes of the two isolates is 76% (Reyes et al. (1990) Science 247: 1335-1339; Tam et
al. (1991) Virology 185: 120-131; Huang et al. (1992) Virology 191: 550-558). Many of the
nucleotide differences were noted at the third codon position, such that the deduced similarities
in amino acid sequences between the Burmese and Mexican strains of HEV were 83%, 93%
and 87%, for open reading frames ORF 1, ORF 2, and ORF 3, respectively.

In the Burmese strain, there is a short non-translated region of about 27 nucleotides at 
the 5' end of the genome which has not been identified in the Mexican strain. ORF 1 
comprises approximately 5,100 nucleotides, which encode several conserved motifs including a 
putative methyltransferase domain, an RNA helicase domain, a putative RNA-dependent RNA 
polymerase (RDRP) domain, and a putative papain-like protease. A tripeptide sequence of
Gly-Asp-Asp (GDD), found in all positive-sense RNA plant and animal viruses, is located within
ORF 1 and usually signifies RDRP function. Conserved motifs suggestive of purine NTPase
activity that is usually associated with cellular and viral helicases also are present in the ORF 1
sequence. There is no consistent immune response to gene products encoded in ORF 1.

The second open reading frame (ORF 2) occupies the carboxyl one-third of the viral 
genome. ORF 2 comprises approximately 2,000 nucleotides which encode a consensus signal 
peptide sequence at the amino terminus of ORF 2, and a putative capsid protein, translated in a 
+1 reading frame in relation to ORF 1. Frequently, HEV-infected individuals produce
antibodies that react with peptides or recombinant proteins derived from ORF 2.

The third open reading frame (ORF 3) partly overlaps both ORF 1 and ORF 2, and 
comprises 369 nucleotides translated in the +2 reading frame in relation to ORF 1. Although
the function of the protein encoded by ORF 3 is unknown, the protein is antigenic, with most 
HEV infected individuals producing antibodies to this protein. Accordingly, peptides or 
recombinant proteins derived from ORF 2 and ORF 3 may serve as serologic markers useful in
diagnosing exposure to HEV. 

Recently, several additional HEV isolates have been identified and compared to the 
Burmese and Mexican strains of HEV. Most of the recent isolates are more closely related to 
the Burmese strain than to the Mexican strain of HEV. Except for a brief appearance in
1986-1987, there have been no additional isolates of the Mexican strain of HEV (Velasquez
et al. (1992) JAMA, 263: 3281-3286).

One isolate, referred to as SAR-55, recently was isolated from an HEV-infected 
individual from Pakistan. The SAR-55 isolate is highly related to the Burmese strain with 
nucleotide and amino acid identities of 94% and 99%, respectively, across the entire genome. 
Several other recent isolates have been obtained from the Chinese province of Xuar, bordering
on Pakistan. These Chinese isolates were more closely related to the Pakistani strain
(approximately 98% nucleotide identity) than to the Burmese strain (approximately 93%
nucleotide identity).

Prior to the sequencing of the viral genome and the availability of viral-encoded
recombinant proteins and synthetic peptides, HEV infection was monitored by electron
microscopy and immunofluorescence. Soon after the identification of the HEV genome, specific
laboratory techniques for detecting HEV infection became available including (i) specific
immunoassays, for example, western blot assays and ELISAs based on recombinant proteins
and/or synthetic peptides, and (ii) polymerase chain reactions (PCR), for example, reverse
transcriptase PCR (RT-PCR). RT-PCR has been used successfully to detect HEV RNA in
samples of stool or serum in cases of acute hepatitis infections, and in epidemics of
enterically transmitted non-A, non-B hepatitis (ET-NANBH). Furthermore, by using recombinant
antigens derived from the Mexican and Burmese strains of HEV, specific IgG, IgM and, in some
cases, IgA antibodies to HEV have been detected in specimens obtained from ET-NANBH
outbreaks in Somalia, Burma, Borneo, Tashkent, Kenya, Pakistan and Mexico. Specific IgG, and
sometimes IgM, antibodies to HEV have been detected in cases of acute, sporadic hepatitis in
geographic regions such as Egypt, India, Tajikistan and Uzbekistan as well as in acute hepatitis
cases among patients in industrialized nations (for example, US, UK, Netherlands and Japan)
who traveled to areas endemic for HEV.

To date, PCR and immunoassay-based tests based on the Burmese and Mexican isolates 
of HEV have established that various cases of "waterborne hepatitis" were caused by HEV. 
The antibody tests also were important in establishing HEV as a cause of acute, sporadic
hepatitis in developing nations and among travelers to regions where HEV is endemic.
However, it is unclear how many cases of acute HEV currently go undiagnosed due to the
inability of current reagents to detect exposure to all strains of HEV. Accordingly, as new
isolates of HEV are identified, it is desirable to develop new compositions and methods for
detecting and/or treating hepatitis caused by the new HEV strains, which heretofore remain
undetectable by the currently available test kits.

Summary of the Invention 



The invention is based, in part, upon the discovery of a new family of human hepatitis E
viruses. The newly discovered family of hepatitis E viruses falls within a class referred to
hereinafter as a US-type hepatitis E virus. Furthermore, two members of the family were
discovered in individuals living in the United States and exhibit considerable similarities when
compared at the nucleotide and amino acid levels. The latter two members together belong to a
subclass of the US-type hepatitis E virus, referred to hereinafter as US-subtype hepatitis E
virus.

Accordingly, in one aspect, the invention provides a method for detecting the presence
of a US-type or US-subtype hepatitis E virus in a test sample of interest. The method
comprises the steps of (a) contacting the test sample with a binding partner that binds
specifically to a marker (or target) for the virus, which if present in the sample binds to the
binding partner to produce a marker-binding partner complex, and (b) detecting the presence or
absence of the complex. The presence of the complex is indicative of the presence of the virus
in the test sample.

In one embodiment, the marker is an anti-US-type or anti-US-subtype antibody, for
example, an immunoglobulin G (IgG) or an immunoglobulin M (IgM) molecule, present in the
sample of interest, and the binding partner is an isolated polypeptide chain defining an epitope
that binds specifically to the marker. In such a case, it is contemplated that the test sample is a
body fluid sample, for example, blood, serum or plasma, harvested from an individual under
investigation. In a preferred embodiment, the polypeptide chain defining a US-type or
US-subtype specific epitope is immobilized on a solid support. Thereafter, the immobilized
polypeptide chain is combined with the sample under conditions that permit the marker
antibody, for example, an anti-US-type or anti-US-subtype hepatitis E virus specific antibody,
present in the sample to bind to the immobilized polypeptide. Thereafter, the presence or
absence of bound antibody can be detected using, for example, a second antibody or an antigen
binding fragment thereof, for example, an anti-human antibody or an antigen binding fragment
thereof, labeled with a detectable moiety.

It is contemplated that many different US-type and US-subtype specific polypeptides
may be useful as a binding partner in the practice of this embodiment of the invention. For
example, in one preferred embodiment of the invention, it is contemplated that the binding
partner may be at least a portion, for example, at least 5, preferably at least 8, more preferably
at least 15 and even more preferably at least about 25 amino acid residues, of a polypeptide
chain selected from the group consisting of SEQ ID NOS:91, 92 and 93, including naturally
occurring variants thereof, and which represent a unique amino acid sequence when compared
to the corresponding amino acid sequences of members of the Burmese and Mexican families.
Similarly, it is contemplated that the binding partner may be a polypeptide chain comprising the
amino acid sequence set forth in SEQ ID NOS:173, 174, or 175. In another preferred
embodiment of the invention, it is contemplated that the binding partner may be at least a
portion, for example, at least 5, preferably at least 8, more preferably at least 15 and even more
preferably at least about 25 amino acid residues, of a polypeptide chain selected from the group
consisting of SEQ ID NOS:166, 167 and 168, including naturally occurring variants thereof,
and which represent a unique amino acid sequence when compared to the corresponding amino
acid sequences of members of the Burmese and Mexican families. Similarly, it is contemplated
that the binding partner may be a polypeptide chain comprising the amino acid sequence set
forth in SEQ ID NOS:176, 223 or 224.

In another embodiment of the invention, the marker is a polypeptide chain unique for a 
member of the US-type or US-subtype families of HEV, and the binding partner preferably is
an isolated antibody, for example, a polyclonal or monoclonal antibody, that binds to an epitope 
on the marker polypeptide chain. The binding partner may be either labeled with a detectable 
moiety or immobilized on a solid support. For example, it is contemplated that practice of this 
embodiment of the invention may be facilitated by immobilizing on a solid support, a first 
antibody that binds a first epitope on the marker polypeptide of interest. A test sample to be
analyzed then is combined with the solid support under conditions that permit the immobilized 
antibody to bind the marker polypeptide. Thereafter, the presence or absence of bound marker 
polypeptide chain may be determined using, for example, a second antibody conjugated with a 
detectable moiety which binds to a second, different epitope on the marker polypeptide chain. 

An antibody useful in the practice of this embodiment of the invention preferably is
capable of binding specifically to a polypeptide chain selected from the group consisting of
SEQ ID NOS:91, 92, and 93, including naturally occurring variants thereof, and has a higher
binding affinity for such a polypeptide chain relative to the corresponding sequences of
members of the Burmese and Mexican families. It is contemplated that an antibody useful in
the practice of the invention preferably is capable of binding specifically to a polypeptide chain
comprising the amino acid sequence set forth in SEQ ID NOS:173 or 175. This antibody is
further characterized as, under similar conditions, preferably having a lower affinity for, and
most preferably failing to bind, the amino acid sequence set forth in SEQ ID NOS:169 or 171
or the regions in the Burmese and Mexican strains that correspond to SEQ ID NO:175.
Similarly, it is contemplated that an antibody useful in the practice of the invention preferably
is capable of binding specifically to a polypeptide chain comprising the amino acid sequence
set forth in SEQ ID NOS:174 or 176. This antibody is further characterized as, under
similar conditions, preferably having a lower affinity for, and most preferably failing to bind,
the amino acid sequence set forth in SEQ ID NOS:170 or 172 or the regions in the Burmese
and Mexican strains that correspond to SEQ ID NO:176.

Similarly, it is contemplated that an antibody useful in the practice of this embodiment
of the invention preferably is capable of binding specifically to a polypeptide chain selected
from the group consisting of SEQ ID NOS:166, 167, and 168, including naturally occurring
variants thereof, and has a higher binding affinity for such a polypeptide chain relative to the
corresponding sequences of members of the Burmese and Mexican families. It is contemplated
that an antibody useful in the practice of the invention preferably is capable of binding
specifically to a polypeptide chain comprising the amino acid sequence set forth in SEQ ID
NO:223. This antibody is further characterized as, under similar conditions, preferably
having a lower affinity for, and most preferably failing to bind, the amino acid sequences set
forth in SEQ ID NOS:170 or 172. Similarly, it is contemplated that an antibody useful in the
practice of the invention preferably is capable of binding specifically to a polypeptide chain
comprising the amino acid sequence set forth in SEQ ID NO:224. This antibody is further
characterized as, under similar conditions, preferably having a lower affinity for, and most
preferably failing to bind, the amino acid sequence set forth in SEQ ID NOS:169 or 171.

In another embodiment of the invention, the marker is a nucleic acid sequence defining
at least a portion of a genome of a US-type or US-subtype hepatitis E virus, or a sequence
complementary thereto. Similarly, it is contemplated that the binding partner is an isolated
nucleic acid sequence, for example, a deoxyribonucleic acid (DNA), ribonucleic acid (RNA) or
peptidyl nucleic acid (PNA) sequence, preferably comprising 8 to 100 nucleotides, more
preferably comprising 10 to 75 nucleotides and most preferably comprising 15 to 50
nucleotides, which is capable of hybridizing specifically, for example, under specific
hybridization conditions or under specific PCR annealing conditions, to the nucleotide
sequence set forth in SEQ ID NOS:89 or 164.

Practice of this embodiment of the invention may be facilitated, for example, by
isolating nucleic acids from the sample of interest. Thereafter, the resulting nucleic acids may
be fractionated by, for example, gel electrophoresis, transferred to, and immobilized onto, a
solid support, for example, nitrocellulose or nylon membrane, or alternatively may be
immobilized directly onto the solid support via conventional dot blot or slot blot
methodologies. The immobilized nucleic acid then may be probed with a preselected nucleic
acid sequence, labeled with a detectable moiety, that hybridizes specifically to the marker
sequence. Alternatively, the presence of marker nucleic acid in a sample may be determined by
standard amplification-based methodologies, for example, polymerase chain reaction (PCR),
wherein the production of a specific amplification product is indicative of the presence of
marker nucleic acid in the sample.

In another aspect, the invention provides isolated US-type and US-subtype specific
polypeptide sequences. These polypeptides include those described hereinabove in the section
pertaining to US-type and US-subtype hepatitis E specific polypeptide chains useful as
binding partners. In a preferred embodiment, the isolated polypeptide chain comprises an
amino acid sequence set forth in SEQ ID NO:93, SEQ ID NO:168, SEQ ID NO:173, SEQ ID
NO:174, SEQ ID NO:175, SEQ ID NO:176, SEQ ID NO:223 or SEQ ID NO:224. It is
contemplated that these and other US-type and US-subtype specific polypeptide chains may be
employed in an assay format for detecting the presence of anti-US-type or anti-US-subtype
hepatitis E specific antibodies in a sample. In addition, it is contemplated that these
polypeptides may be used either alone or in combination with adjuvants for the production of
antibodies in laboratory animals, or similarly, used in combination with pharmaceutically
acceptable carriers as vaccines for either the prophylactic or therapeutic immunization of
mammals.



In another aspect, the invention provides isolated anti-US-type or anti-US-subtype
hepatitis E specific antibodies, which include those discussed hereinabove in the section
pertaining to antibodies useful as binding partners. In a preferred embodiment, the isolated
antibody is capable of binding specifically to a polypeptide chain selected from the group
consisting of a polypeptide encoded by an ORF 1 sequence of a US-type or a US-subtype
hepatitis E virus, a polypeptide encoded by an ORF 2 sequence of a US-type or a US-subtype
hepatitis E virus, or a polypeptide encoded by an ORF 3 sequence of a US-type or a
US-subtype hepatitis E virus. In particular, it is contemplated that useful antibodies are
characterized in that they are capable of binding specifically to a polypeptide chain comprising
the amino acid sequence set forth in SEQ ID NO:93, SEQ ID NO:168, SEQ ID NO:173, SEQ
ID NO:174, SEQ ID NO:175, SEQ ID NO:176, SEQ ID NO:223 or SEQ ID NO:224. It is
contemplated that these antibodies and other antibodies may be used to advantage in
immunoassays for detecting the presence in a sample of members of the US-type or US-subtype
hepatitis E families. The antibody may be used either in a direct immunoassay wherein the
antibody itself preferably is labeled with a detectable moiety or in an indirect immunoassay
wherein the antibody itself provides a target for a second binding partner, e.g., a second
antibody labeled with a detectable moiety. Furthermore, it is contemplated that these
antibodies may be used in combination with, for example, a pharmaceutically acceptable carrier
for use in the passive, therapeutic or prophylactic immunization of a mammal.

In another aspect, the invention provides isolated nucleic acid sequences such as those
discussed in the previous section pertaining to the use of nucleic acids as a marker or a binding
partner for detecting the presence of a US-type or US-subtype hepatitis E virus in a sample. In
a preferred embodiment, the invention provides isolated nucleic acid sequences defining at least
a portion of an ORF 1, ORF 2 or ORF 3 sequence of a US-type or US-subtype hepatitis E virus,
or a sequence complementary thereto. It is contemplated that these and other nucleic acid
sequences may be used, for example, as nucleotide probes and/or amplification primers for
detecting the presence of a US-type or US-subtype hepatitis E virus in a sample of interest. In
addition, it is contemplated that the nucleic acid sequences or sequences complementary thereto
may be combined with a pharmaceutically acceptable carrier for use in anti-sense therapy.
Furthermore, it is contemplated that the nucleic acid sequences may be integrated in vectors
which may then be transformed or transfected into a host cell of interest. The host cells may
then be combined with a pharmaceutically acceptable carrier and used as a vaccine, for example,
a recombinant vaccine, for immunizing a mammal, either prophylactically or therapeutically,
against a preselected US-type or US-subtype hepatitis E virus.

The foregoing and other objects, features and advantages of the present invention will 
be made more apparent from the following detailed description of preferred embodiments of the
invention. 



Brief Description of the Drawings 

The objects and features of the invention may be better understood by reference to the 
drawings described below, in which:

Figure 1 is a schematic representation of an HEV genome showing the relative positions
of the ORF 1, ORF 2, and ORF 3 regions. 

Figure 2 is a graph showing levels of serum aspartate aminotransferase (boxes) and 
serum total bilirubin (diamonds) in patient USP-1 from day 1 of a hospital admission through 
day 37 post admission.

Figure 3 is a schematic representation of the HEV US-1 genome showing the relative 
positions of clones isolated during the course of this work. 

Figure 4 is a schematic representation of the HEV US-2 genome showing the relative 
positions of clones isolated during the course of this work. 

Figure 5 shows an unrooted phylogenetic tree depicting the relationship of nucleotide
sequences from full length HEV US-1, HEV US-2, and 10 other HEV isolates. Branch lengths
are proportional to the evolutionary distances between sequences. The scale representing
nucleotide substitutions per position is shown. The internal node numbers indicate the
bootstrap values (expressed as a percentage of all trees) obtained from 100 replicates. Isolates
represented are Burmese, B1, B2; Chinese, C1, C2, C3, C4; Pakistan, P1; Indian, I1, I2;
Mexican, M1; and United States, US-1, US-2.

Figure 6 shows an unrooted phylogenetic tree depicting the relationship of nucleotide
sequences from the ORF 2/3 regions (i.e., sequences corresponding to nucleotide residue
numbers 5094-7114 of SEQ ID NO:89). Branch lengths are proportional to the evolutionary
distances between sequences. The scale representing nucleotide substitutions per position is
shown. The internal node numbers indicate the bootstrap values (expressed as a percentage of
all trees) obtained from 100 replicates. Isolates represented are Burmese, B1, B2; Chinese, C1,
C2, C3, C4; Pakistan, P1; Indian, I1, I2; Mexican, M1; Swine, S1; and United States, US-1,
US-2.

Figure 7 is a graph showing levels of alanine aminotransferase (boxes), serum aspartate
aminotransferase (circles), and gamma-glutamyltransferase (triangles) in a macaque before and
after inoculation with sera harvested from patient USP-2. Also shown are times when HEV US-2
RNA was present in serum and fecal samples, as well as times when anti-HEV US-2 IgM and
IgG were detectable.

Figure 8 is a schematic representation of the It1 genome showing the relative positions
of clones isolated during the course of this work.

Figure 9 shows alignments of Burmese (B1), Mexican (M1), Chinese (C1), Pakistan
(P1) and US-1 sequences showing the design of HEV consensus primers for ORF 1, ORF 2/3
and ORF 2. Preferred consensus primers are denoted by the highlighted boxes.

Figure 10 shows an unrooted phylogenetic tree depicting the relationship of ORF 1
nucleotide sequences 371 nucleotides in length and corresponding to residues 26-396 of SEQ
ID NO:89. The scale representing nucleotide substitutions per position is shown. The internal
node numbers indicate the bootstrap values (expressed as a percentage of all trees) obtained
from 1000 replicates. Isolates represented are Burmese, B1, B2; Chinese, C1, C2, C3, C4;
Pakistan, P1; Indian, I1, I2; Mexican, M1; Italian, It1; Greek, G1, G2; and United States, US-1,
US-2.

Figure 11 shows an unrooted phylogenetic tree depicting the relationship of ORF 2
nucleotide sequences 148 nucleotides in length and corresponding to residues 6307-6454 of
SEQ ID NO:89. The scale representing nucleotide substitutions per position is shown. The
internal node numbers indicate the bootstrap values (expressed as a percentage of all trees)
obtained from 1000 replicates. Isolates represented are Burmese, B1, B2; Chinese, C1, C2, C3,
C4; Pakistan, P1; Indian, I1, I2; Mexican, M1; Italian, It1; Greek, G1, G2; Swine, S1; and
United States, US-1 and US-2.

Figure 12 shows a schematic representation of preferred HEV-US recombinant protein
constructs. In Figure 12A, the ORF 2 and ORF 3 structural proteins of HEV are shown with the
first and last amino acid positions designated. The presence of immunodominant epitopes is
indicated by lines within the ORFs. Figure 12B shows an ORF 3 region that was cloned into an
expression vector, with the first and last amino acid positions designated (SEQ ID NO:203 or
SEQ ID NO:204). Figure 12C shows an ORF 2 region that was cloned into an expression
vector, with the first and last amino acid positions designated (SEQ ID NO:199 or 200). Figure
12D shows an ORF 3/2 chimeric construct cloned into an expression vector with the first and
last amino acid positions of each component of the chimeric construct designated (SEQ ID
NO:206 or 207). The sequence omitted from the ORF 3/2 construct is indicated with a dashed
line. In Figures 12B-12D, the presence of a FLAG® peptide at the carboxyl terminus of each
construct is indicated by a solid box.

Figure 13 is a graph showing levels of alanine aminotransferase (square), IgG (circle)
and IgM (star) in a macaque before and after inoculation with sera harvested from patient
USP-2.

Figure 14 shows an unrooted phylogenetic tree depicting the relationship of ORF 1
nucleotide sequences 371 nucleotides in length and corresponding to residues 26-396 of SEQ
ID NO:89. The scale representing nucleotide substitutions per position is shown. The internal
node numbers indicate the bootstrap values (expressed as a percentage of all trees) obtained
from 1000 replicates. Isolates represented are Burmese, B1, B2; Chinese, C1, C2, C3, C4;
Pakistan, P1; Indian, I1, I2; Mexican, M1; Italian, It1; Greek, G1, G2; Austrian, Au1;
Argentine, Ar1, Ar2; and United States, US-1, US-2.

Figure 15 shows an unrooted phylogenetic tree depicting the relationship of ORF 2
nucleotide sequences 148 nucleotides in length and corresponding to residues 6307-6454 of
SEQ ID NO:89. The scale representing nucleotide substitutions per position is shown. The
internal node numbers indicate the bootstrap values (expressed as a percentage of all trees)
obtained from 1000 replicates. Isolates represented are Burmese, B1, B2; Chinese, C1, C2, C3,
C4; Pakistan, P1; Indian, I1, I2; Mexican, M1; Italian, It1; Greek, G1, G2; Austrian, Au1;
Argentine, Ar2; Swine, S1; and United States, US-1 and US-2.

Figure 16 shows an unrooted phylogenetic tree depicting the relationship of ORF 2
nucleotide sequences 98 nucleotides in length and corresponding to residues 6354-6451 of SEQ
ID NO:89. The scale representing nucleotide substitutions per position is shown. The internal
node numbers indicate the bootstrap values (expressed as a percentage of all trees) obtained
from 1000 replicates. Isolates represented are Burmese, B1, B2; Chinese, C1, C2, C3, C4;
Pakistan, P1; Indian, I1, I2; Mexican, M1; Italian, It1; Greek, G1, G2; Austrian, Au1;
Argentine, Ar1, Ar2; Swine, S1; and United States, US-1 and US-2.

Detailed Description of the Invention

As mentioned above, this invention is based, in part, upon the discovery of a new family
of human hepatitis E viruses. The newly discovered family of hepatitis E viruses falls within a
class referred to hereinafter as a US-type hepatitis E virus. Furthermore, as mentioned above,
two members of the US-type family were identified in sera obtained from two individuals
living in the United States of America. These two members together belong to a subclass of the
US-type hepatitis E virus, referred to hereinafter as a US-subtype hepatitis E virus. The
discovery of the US-type and US-subtype hepatitis E viruses enables the development of
methods and compositions for detecting the presence of a US-type or US-subtype hepatitis E
virus in individuals who heretofore have not been diagnosed as suffering from hepatitis based
on commercially available hepatitis detection kits, as well as methods and compositions for
immunizing an individual against such a virus.

In one aspect, the invention pertains to a method of detecting the presence of a US-type
or US-subtype hepatitis E virus in a test sample. The method comprises the steps of (a)
contacting the sample with a binding partner that binds specifically to a marker for such a virus,
which if present in the sample binds to the binding partner to produce a marker-binding partner
complex, and (b) detecting the presence or absence of the complex. The presence of the
complex is indicative of the presence of the virus in the sample. Based on the discovery of the
US-type and US-subtype hepatitis E virus disclosed herein, it will be apparent that a variety of
assays, for example, protein- or nucleic acid-based assays, may be produced for detecting the
presence of the virus in a sample. Protein-based assays may include, for example, conventional
immunoassays, and nucleic acid-based assays may include, for example, conventional probe
hybridization or nucleic acid sequence amplification assays, all of which are well known and
thoroughly discussed in the art.

In another aspect, the invention provides reagents, for example, antibodies, 
epitope-containing polypeptide chains, and nucleotide sequences that may be used to develop vaccines 
for immunizing, either prophylactically or therapeutically, an individual against a US-type or 
US-subtype hepatitis E virus. 

I. Definitions 

So that the invention may be more readily understood, certain terms as used herein are 
defined hereinbelow. 

As used herein, the term "US-type" hepatitis E virus is understood to mean any human 
virus (i.e., capable of infecting a human) that is serologically distinct from hepatitis A virus 
(HAV), hepatitis B virus (HBV), hepatitis C virus (HCV), hepatitis D virus (HDV) and 
hepatitis G virus (HGV) and comprises a single stranded RNA genome defining at least one 
open reading frame and having a nucleotide sequence greater than 79.7% identity to the 
nucleotide sequence defined by residues 6307-6454 of SEQ ID NO:89. 

As used herein, the term "US-subtype" hepatitis E virus is understood to mean any human 
virus (i.e., capable of infecting a human) that is serologically distinct from hepatitis A virus 
(HAV), hepatitis B virus (HBV), hepatitis C virus (HCV), hepatitis D virus (HDV) and 
hepatitis G virus (HGV) and comprises a single stranded RNA genome defining at least one 
open reading frame and having a nucleotide sequence greater than 90.5% identity to the nucleotide 
sequence defined by residues 6307-6454 of SEQ ID NO:89. 
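
The nucleotide identity criteria in the two definitions above can be illustrated with a minimal sketch. The definitions do not state how the sequences are aligned or how gaps are counted, so the sketch assumes two sequences that have already been aligned to equal length and skips any column containing a gap; the function names are hypothetical and are used for illustration only. The sketch addresses the sequence identity criterion alone and not the serological criteria.

```python
# Thresholds taken from the definitions above (percent identity over the region
# corresponding to residues 6307-6454 of SEQ ID NO:89).
US_TYPE_MIN_IDENTITY = 79.7
US_SUBTYPE_MIN_IDENTITY = 90.5


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: columns where either sequence has a gap ('-') are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def classify_by_identity(identity: float) -> str:
    """Apply the 'greater than' thresholds; a US-subtype isolate is also US-type."""
    if identity > US_SUBTYPE_MIN_IDENTITY:
        return "US-subtype"
    if identity > US_TYPE_MIN_IDENTITY:
        return "US-type"
    return "neither"
```

Because both definitions use "greater than", an isolate with exactly 79.7% identity is not classified as US-type in this sketch.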

As used herein, the term "test sample" is understood to mean any sample, for 
example, a biological sample, which contains the marker (for example, an antibody, antigenic 
protein or peptide, or nucleotide sequence) to be tested. Preferred test samples include tissue or 
body fluid samples isolatable from an individual under investigation. Preferred body fluid 
samples include, for example, blood, serum, plasma, saliva, sputum, semen, urine, feces, bile, 
spinal fluid, breast exudate, ascites, and peritoneal fluid. Another preferred test sample is a cell 
line and more preferably, a mammalian cell line. A most preferred cell line is a human fetal 
kidney cell line. 

As used herein, the term "open reading frame" or "ORF" is understood to mean a region 
of a polynucleotide sequence capable of encoding one or more polypeptide chains. The region 
may represent an entire coding sequence, i.e., beginning with an initiation codon (e.g., ATG 
(AUG)) and ending at a termination codon (e.g., TAA (UAA), TAG (UAG), or TGA (UGA)), 
or a portion thereof. 
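
By way of illustration only, the following sketch locates entire coding sequences of the kind just defined, i.e., regions beginning with an initiation codon and ending at the first in-frame termination codon. It scans the three forward reading frames only, and the function name and the `min_codons` parameter are assumptions made for this example rather than part of the disclosure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence: str, min_codons: int = 1) -> list[tuple[int, int]]:
    """Return (start, end) pairs, 0-based and end-exclusive, for each ATG...stop
    region in the three forward reading frames. RNA input (U) is read as T."""
    seq = sequence.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first initiation codon opens the frame
            elif start is not None and codon in STOP_CODONS:
                # number of coding codons, excluding the termination codon
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs
```

For example, `find_orfs("CCATGAAATAGGG")` reports a single region spanning `ATGAAATAG`.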

As used herein, the term "polypeptide chain" is understood to mean any molecular chain 
of amino acids and does not refer to a specific length of the product. Thus, peptides, 
oligopeptides, and proteins are included within the definition of polypeptide chain. 

As used herein, the term "epitope", as used synonymously with "antigenic determinant", 
is understood to mean at least a portion of an antigen capable of being specifically bound (i.e., 
bound with an affinity greater than about 10^5 M^-1, and more preferably with an affinity greater 
than about 10^7 M^-1) by an antibody variable region. Conceivably, an epitope may comprise 
three amino acids in a spatial conformation unique to the epitope. Generally, an epitope 
comprises at least five amino acids, and more usually, at least eight to ten amino acids. 
Methods of examining spatial conformation are known in the art and include, for example, 
x-ray crystallography and two-dimensional nuclear magnetic resonance. 

A polypeptide is "immunologically reactive" with an antibody when it binds to an 
antibody due to antibody recognition of a specific epitope defined by the polypeptide chain. 
Immunological reactivity may be determined by antibody binding, more particularly by the 
kinetics of antibody binding, and/or by a competitive binding study. If a preselected antibody 
is immunologically reactive with a first antigen but is not immunologically reactive or is less 
immunologically reactive with a second, different antigen, then the two antigens are considered 
to be serologically distinct. As used herein, the term "affinity" is understood to mean a 
measure of reversible interaction between two molecules (for example, between an antibody 
and an antigen). The higher the affinity, the stronger the interaction between the two 
molecules. 
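
The affinity values recited above are association constants expressed in M^-1. As a minimal sketch, assuming a simple one-to-one reversible binding model in which the dissociation constant is the reciprocal of the association constant, the thresholds of about 10^5 M^-1 and about 10^7 M^-1 correspond to dissociation constants of about 10 micromolar and about 100 nanomolar, respectively. The function names below are hypothetical.

```python
SPECIFIC_BINDING_MIN_KA = 1e5   # M^-1, "greater than about 10^5 M^-1"
PREFERRED_BINDING_MIN_KA = 1e7  # M^-1, "greater than about 10^7 M^-1"


def dissociation_constant(ka_per_molar: float) -> float:
    """Kd in molar units, assuming 1:1 reversible binding (Kd = 1 / Ka)."""
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / ka_per_molar


def binding_class(ka_per_molar: float) -> str:
    """Classify an association constant against the two recited thresholds."""
    if ka_per_molar > PREFERRED_BINDING_MIN_KA:
        return "more preferred"
    if ka_per_molar > SPECIFIC_BINDING_MIN_KA:
        return "specific"
    return "not specific"
```

A higher association constant gives a lower dissociation constant, consistent with the statement that a higher affinity reflects a stronger interaction.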

As used herein, the term "detectable moiety" is understood to mean any signal 
generating compound, for example, a chromogen, a catalyst such as an enzyme, a luminescent 
compound such as dioxetane, acridinium, phenanthridinium and luminol, a radioactive element, 
and a visually detectable label. Examples of enzymes include alkaline phosphatase, horseradish 
peroxidase, beta-galactosidase, and the like. Although the selection of a particular detectable 
moiety is not critical, the detectable moiety will be capable of producing a signal either by itself 
or in conjunction with one or more additional substances. 

As used herein, the term "solid support" is understood to mean any plastic, derivatized 
plastic, magnetic or non-magnetic metal, glass or silicon surface. Useful surfaces include, for 
example, the surface of a test tube, microtiter well, sheet, bead, microparticle, chip, sheep (or 
other suitable animal's) red blood cell, or duracyte. Suitable solid supports are not critical to 
the practice of the invention and can be selected by one skilled in the art. Suitable methods for 
immobilizing peptides on solid phases include ionic, hydrophobic, covalent interactions and the 
like. The solid support can be chosen for its intrinsic ability to attract and immobilize the 
capture reagent. Alternatively, the solid support can retain an additional receptor which has the 
ability to attract and immobilize the capture reagent. 

It is contemplated that the solid support also may comprise any suitable porous material 
with sufficient porosity to allow access by detection antibodies and a suitable surface affinity to 
bind antigens. Microporous structures generally are preferred, but materials with gel structure 
in the hydrated state may be used as well. All of these materials may be used in suitable 
shapes, such as films, sheets, or plates, or they may be coated onto or bonded or laminated to 
appropriate inert carriers, such as paper, glass, plastic films, or fabrics. 

Other embodiments which utilize various other solid supports also are contemplated and 
are within the scope of this invention. For example, ion capture procedures for immobilizing 
an immobilizable reaction complex with a negatively charged polymer, described in EP 
Publication No. 0 326 100 and EP Publication No. 0 406 473, can be employed according to the 
present invention to effect a fast solution-phase immunochemical reaction. An immobilizable 
immune complex is separated from the rest of the reaction mixture by ionic interactions 
between the negatively charged poly-anion/immune complex and the previously treated, 
positively charged porous matrix and detected by using various signal generating systems 
previously described, including chemiluminescent signal measurements as described in EP 
Publication No. 0 273 115. 

Also, the methods of the present invention can be adapted for use in systems which 
utilize microparticle technology including automated and semi-automated systems wherein the 
solid phase comprises a microparticle (magnetic or non-magnetic). Such systems include those 
described in U.S. Patent Nos. 5,089,424 and 5,244,630, issued February 18, 1992 and 
September 14, 1993, respectively. 

The use of scanning probe microscopy (SPM) for immunoassays also is a technology to 
which the monoclonal antibodies of the present invention are easily adaptable. In scanning 
probe microscopy, in particular in atomic force microscopy, the capture phase, for example, at 
least one of the monoclonal antibodies of the invention, is adhered to a solid phase and a 
scanning probe microscope is utilized to detect antigen/antibody complexes which may be 
present on the surface of the solid phase. The use of scanning tunneling microscopy eliminates 
the need for labels which normally must be utilized in many immunoassay systems to detect 
antigen/antibody complexes. The use of SPM to monitor specific binding reactions can occur 
in many ways. In one embodiment, one member of a specific binding partner (analyte specific 
substance which is the monoclonal antibody of the invention) is attached to a surface suitable 
for scanning. The attachment of the analyte specific substance may be by adsorption to a test 
piece which comprises a solid phase of a plastic or metal surface, following methods known to 
those of ordinary skill in the art. Alternatively, covalent attachment of a specific binding partner 
(analyte specific substance) to a test piece, which test piece comprises a solid phase of derivatized 
plastic, metal, silicon, or glass, may be utilized. Covalent attachment methods are known to 
those skilled in the art and include a variety of means to irreversibly link specific binding 
partners to the test piece. If the test piece is silicon or glass, the surface must be activated prior 
to attaching the specific binding partner. Also, polyelectrolyte interactions may be used to 
immobilize a specific binding partner on a surface of a test piece by using techniques and 
chemistries described in EP Publication No. 0 326 100 and EP Publication No. 0 406 473. The 
preferred method of attachment is by covalent attachment. Following attachment of a specific 
binding member, the surface may be further treated with materials such as serum, proteins, or 
other blocking agents to minimize non-specific binding. The surface also may be scanned 
either at the site of manufacture or point of use to verify its suitability for assay purposes. The 
scanning process is not anticipated to alter the specific binding properties of the test piece. 

As used herein, the terms "nucleotide sequence" or "nucleic acid sequence" are 
understood to mean any polymeric form of nucleotides of any length, either ribonucleotides or 
deoxyribonucleotides. The term refers to the primary structure of the molecule. Thus, the term 
includes double- and single-stranded DNA, as well as double- and single-stranded RNA. It also 
includes modifications, for example, by methylation and/or by capping, and unmodified forms 
of the polynucleotide. 

As used herein, the term "primer" is understood to mean a specific oligonucleotide 
sequence complementary to a target nucleotide sequence which is capable of hybridizing to the 
target nucleotide sequence and serving as an initiation point for nucleotide polymerization 
catalyzed by DNA polymerase, RNA polymerase or reverse transcriptase. 
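
The complementarity requirement in the definition of "primer" can be sketched as follows. The example computes the reverse complement of a target site, which is the sequence a perfectly matched DNA primer would have when written 5' to 3'. The function names and the short sequences used in the accompanying checks are illustrative assumptions and are not sequences disclosed herein; practical primers may tolerate mismatches, which this sketch does not model.

```python
# Maps each DNA or RNA base to its complementary DNA base.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(target_site: str) -> str:
    """DNA sequence (5' to 3') perfectly complementary to a DNA or RNA target site."""
    return target_site.translate(_COMPLEMENT)[::-1]


def is_perfect_primer(primer: str, target_site: str) -> bool:
    """True if the primer is the exact reverse complement of the target site."""
    return primer.upper() == reverse_complement(target_site).upper()
```

Because the genome of the virus is RNA, the sketch accepts U in the target and always returns a DNA primer sequence.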

When referring to a nucleic acid fragment, such a fragment is considered to "specifically 
hybridize" or to "specifically bind" to an HEV US-type or US-subtype polynucleotide or 
variants thereof, if, within the linear range of detection, the hybridization results in a stronger 
signal relative to the signal that would result from hybridization to an equal amount of a 
polynucleotide from other than an HEV US-type, US-subtype or variant thereof. A signal 
which is "stronger" than another is one which is measurable over the other by the particular 
method of detection. 



Also, when referring to a nucleic acid fragment, such a fragment is considered to 
hybridize under specific hybridization conditions if it specifically hybridizes under (i) typical 
hybridization and wash conditions, such as those described, for example, in Maniatis (1st 
Edition, pages 387-389, 1982), where preferred hybridization conditions are those of lesser 
stringency and more preferred, higher stringency; or (ii) standard PCR conditions (Saiki, R.K. 
et al.) or "touch-down" PCR conditions (Roux, K.H., (1994), BioTechniques, 16:812-814). 

As used herein, the term "probe" is understood to mean any nucleotide or nucleotide 
analog (e.g., PNA) containing a sequence which can be used to identify specific DNA or RNA 
present in samples bearing the complementary sequence. 

As used herein, the term "PNA" is used to mean peptide nucleic acid analog which may 
be utilized in a procedure such as an assay described herein to determine the presence of a 
target. "MA" denotes a "morpholino analog" which may be utilized in a procedure such as an 
assay described herein to determine the presence of a target. See, for example, U.S. Patent No. 
5,378,841, which is incorporated herein by reference. PNAs typically are neutrally charged 
moieties which can be directed against RNA targets or DNA. PNA probes used in assays in 
place of, for example, the DNA probes of the present invention, offer advantages not achievable 
when DNA probes are used. These advantages include manufacturability, large scale labeling, 
reproducibility, stability, insensitivity to changes in ionic strength and resistance to enzymatic 
degradation which is present in methods utilizing DNA or RNA. These PNAs can be labeled 
with such signal generating compounds as fluorescein, radionucleotides, chemiluminescent 
compounds, and the like. PNAs or other nucleic acid analogs such as MAs thus can be used in 
assay methods in place of DNA or RNA. Although assays are described herein utilizing DNA 
probes, it is within the scope of the routineer that PNAs or MAs can be substituted for RNA or 
DNA with appropriate changes if and as needed in assay reagents. 




II. Detection Methods and Reagents 

It is contemplated that the detection methods of the invention may employ a variety of 
protein-based or nucleic acid-based assays which are described in detail below. 

It is contemplated that a reagent for the detection of virus or markers thereof may be 
either an anti-US-type and/or US-subtype hepatitis E virus antibody, a US-type and/or 
US-subtype specific polypeptide, or a nucleic acid defining at least a portion of the genome of a 
US-type and/or US-subtype hepatitis E virus or a nucleic acid sequence complementary thereto. 

II. (i) Protein-based Assays 

a. Marker Antibodies: It is contemplated that if the viral marker is an anti-US-type or 
anti-US-subtype specific antibody, for example, an IgG or an IgM molecule, circulating in the 
blood stream of an individual of interest, the binding partner preferably is a polypeptide 
defining an epitope that binds specifically to the marker. 

A preferred protocol for detecting the presence of anti-US-type or anti-US-subtype 
hepatitis E virus antibodies in a test sample comprises the following 
steps: (a) providing an antigen comprising an immunologically reactive US-type 
or US-subtype specific polypeptide chain comprising at least 5, more preferably at least 8, even 
more preferably at least 15, and most preferably at least 25 contiguous amino acid residues and 
bindable by the antibody; (b) incubating the antigen with the test sample under conditions that 
permit formation of an antibody-antigen complex; and (c) detecting the presence of the 
complex. 

It is contemplated that many different US-type or US-subtype specific polypeptides 
may be useful as a binding partner for the detection of anti-US-type or anti-US-subtype 
antibodies. For example, it is contemplated that the polypeptide chain may be an amino acid 
sequence defined by SEQ ID NOS:91, 92 or 93 or an immunologically reactive fragment 
thereof containing preferably at least 5, more preferably at least 8, even more preferably at 
least 15, and most preferably at least about 25 contiguous amino acid residues of the 
polypeptide chain set forth in SEQ ID NOS:91, 92, or 93, and which represent a unique amino 
acid sequence when compared to the corresponding amino acid sequences of members of the 
Burmese and Mexican families. The Burmese family, i.e., "Burmese-like" strains, as used 
herein, presently comprises strains referred to herein as B1, B2, I1, I2, C1, C2, C3, C4 and P1, 
and the Mexican family presently comprises strain M1. 
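
One way to look for candidate fragments of the kind described above is sketched below. The sketch treats a fragment as "unique" when the exact run of contiguous residues does not occur anywhere in the supplied Burmese-like or Mexican reference sequences; this exact-match reading is an assumption made for illustration, as are the function name and the toy sequences used in the checks, none of which are sequences disclosed herein.

```python
def unique_fragments(candidate: str, references: list[str], length: int = 8):
    """Return (1-based start position, fragment) for every window of `length`
    contiguous residues in `candidate` that is absent from all `references`."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # Collect every window of the requested length found in any reference.
    reference_windows = {
        ref[i:i + length]
        for ref in references
        for i in range(len(ref) - length + 1)
    }
    hits = []
    for i in range(len(candidate) - length + 1):
        fragment = candidate[i:i + length]
        if fragment not in reference_windows:
            hits.append((i + 1, fragment))
    return hits
```

The `length` parameter can be set to 5, 8, 15 or 25 to mirror the preferred minimum fragment lengths recited above. Whether a given fragment is immunologically reactive must still be determined experimentally.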

It is contemplated that the binding partner may be a polypeptide selected from the group 
consisting of polypeptides defined by SEQ ID NOS:91, 92, and 93, including naturally 
occurring variants thereof. As used herein the term "naturally occurring variants thereof" with 
respect to the polypeptide defined by SEQ ID NO:91 is understood to mean any amino acid 
sequence that is at least 84%, preferably at least 86%, more preferably at least 89% and even 
more preferably at least 95% identical to residues 1 through 1698 of SEQ ID NO:91. As used 
herein the term "naturally occurring variants thereof" with respect to the polypeptide defined by 
SEQ ID NO:92 is understood to mean any amino acid sequence that is at least 93%, preferably 
at least 95%, and even more preferably at least 98% identical to residues 1 through 660 of SEQ 
ID NO:92. As used herein the term "naturally occurring variants thereof" with respect to the 
polypeptide defined by SEQ ID NO:93 is understood to mean any amino acid sequence that is 
at least 85.4%, preferably at least 87.4%, more preferably at least 90.4% and even more 
preferably at least 95% identical to residues 1 through 122 of SEQ ID NO:93. 

Furthermore, it is contemplated that the binding partner may be a polypeptide encoded 
by a portion of an ORF 1 sequence. Proteins encoded by the ORF 1 sequence include, for 
example, a methyltransferase protein, a protease, a Y domain protein, an X domain protein, a 
helicase protein, a hypervariable region protein, and an RNA-dependent RNA polymerase 
protein. Accordingly, it is contemplated that a useful methyltransferase protein preferably has 
at least 92.3%, more preferably has at least 94.3%, and most preferably has at least 97.3% 
identity to residues 1-231 of SEQ ID NO:91. Also, it is contemplated that a useful protease 
protein preferably has at least 70.3%, more preferably has at least 72.3%, and most preferably 
has at least 75.3% identity to residues 424-697 of SEQ ID NO:91. Also, it is contemplated that 
a useful Y domain protein preferably has at least 94.6%, more preferably has at least 96.6% and 
most preferably has at least 99.6% identity to residues 207-424 of SEQ ID NO:91. Also, it is 
contemplated that a useful X domain protein preferably has at least 83.4%, more preferably has 
at least 85.4% and most preferably has at least 88.4% identity to residues 789-947 of SEQ ID 
NO:91. Also, it is contemplated that a useful helicase protein has at least 92%, more 
preferably has at least 94% and most preferably at least 93% identity to residues 965-1197 of 
SEQ ID NO:91. Also, it is contemplated that a useful hypervariable region protein has at least 
28.7%, more preferably has at least 30.7%, and most preferably has at least 33.7% identity to 
the residues 698-788 of SEQ ID NO:91. Also, it is contemplated that a useful RNA-dependent 
RNA polymerase has at least 88.8%, more preferably has at least 90.8%, and most preferably 
has at least about 93.8% identity to residues 1212-1698 of SEQ ID NO:91. 

Furthermore, it is contemplated that the binding partner may be a polypeptide chain 
having an amino acid sequence defined by SEQ ID NOS:166, 167 or 168, or an 
immunologically reactive fragment thereof containing at least 5, preferably at least 8, more preferably 
at least 15 and most preferably at least 25 contiguous amino acid residues of the polypeptide 
chain set forth in SEQ ID NOS:166, 167 or 168, and which represent a unique amino acid 
sequence when compared to the corresponding amino acid sequences of members of the 
Burmese and Mexican families. Similarly, it is contemplated that the binding partner may be a 
polypeptide selected from the group consisting of SEQ ID NOS:166, 167 and 168, including 
naturally occurring variants thereof. As used herein, the term "naturally occurring variants 
thereof" with respect to the polypeptide defined by SEQ ID NO:166 is understood to mean any 
amino acid sequence that is at least 83.9%, preferably at least 85.9%, more preferably at least 
88.9%, and most preferably at least 95% identical to residues 1 through 1708 of SEQ ID 
NO:166. As used herein, the term "naturally occurring variants thereof" with respect to the 
polypeptide defined by SEQ ID NO:167 is understood to mean any amino acid sequence that is 
at least 93%, preferably at least 95%, and most preferably at least 98% identical to residues 1 
through 660 of SEQ ID NO:167. As used herein, the term "naturally occurring variants 
thereof" with respect to the polypeptide defined by SEQ ID NO:168 is understood to mean any 
amino acid sequence that is at least 85.4%, preferably at least 87.4%, more preferably at least 
90.4%, and even more preferably at least 95% identical to residues 1 through 122 of SEQ ID 
NO:168. 

Furthermore, it is contemplated that the binding partner may be a polypeptide encoded 
by a portion of the HEV US-2 ORF 1, including, for example, a methyltransferase protein, a 
protease, a Y domain protein, an X domain protein, a helicase protein, a hypervariable region 
protein and an RNA-dependent RNA polymerase protein, or a variant thereof. Accordingly, it 
is contemplated that a useful methyltransferase protein preferably has at least 92.7%, more 
preferably has at least 94.7%, and most preferably has at least 97.7% identity to residues 1-240 
of SEQ ID NO:166. Also, it is contemplated that a useful protease protein preferably has at 
least 69.6%, more preferably has at least 71.6%, and most preferably has at least 74.6% 
identity to residues 433-706 of SEQ ID NO:166. Also, it is contemplated that a useful Y 
domain protein preferably has at least 94.6%, more preferably has at least 96.6%, and most 
preferably has at least 99.6% identity to residues 216-433 of SEQ ID NO:166. Also, it is 
contemplated that a useful X domain protein preferably has at least 82.8%, more preferably has 
at least 84.8%, and most preferably has at least 87.8% identity to residues 799-957 of SEQ ID 
NO:166. Also, it is contemplated that a useful helicase protein has at least 92.8%, more 
preferably has at least 94.8%, and most preferably has at least 97.8% identity to residues 
975-1207 of SEQ ID NO:166. Also, it is contemplated that a useful hypervariable region protein 
has at least 27%, more preferably has at least 29%, and most preferably has at least 31% 
identity to the residues 707-798 of SEQ ID NO:166. Also, it is contemplated that a useful 
RNA-dependent RNA polymerase has at least 88.7%, more preferably has at least 90.7%, and 
most preferably has at least 93.7% identity to residues 1222-1708 of SEQ ID NO:166. 

With regard to the identification of US-type or US-subtype specific epitopes, it is 
contemplated that one skilled in the art in possession of nucleic acid sequences defining and/or 
amino acid sequences encoded by at least a portion of the genome of a US-type or US-subtype 
hepatitis E virus can map potential epitope sites using conventional technologies well known 
and thoroughly discussed in the art. In addition to the use of commercially available software 
packages which identify potential epitope sites in a given sequence, it is possible to identify 
potential epitopes by comparison of amino acid sequences encoded by such a genome with 
sequences encoded by the genomes of other strains of HEV whose antigenic sites have already 
been elucidated. See, for example, U.S. Patent Nos. 5,686,239, 5,741,490 and 5,770,689. 
Epitopes currently identified are shown in Figure 1, and include epitopes referred to in the art as 
8-5 (SEQ ID NOS:93 and 168), 4-2 (position 90-122 of SEQ ID NOS:93 and 168), SG3 
(SEQ ID NOS:175 and 176), 3-2 (position 613-654 of SEQ ID NOS:92 and 167) and 3-2e 
(position 613-660 of SEQ ID NOS:92 and 167). A method for calculating antigenic index is 
described by Jameson and Wolf (CABIOS, 4(1), 181-186 [1988]). 
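
The epitope positions recited above are 1-based and inclusive. As a minimal sketch of how such positions map onto a sequence held in software, the following example extracts a region from a protein string; the dictionary name, the function name and the placeholder sequence used in the checks are assumptions for illustration, and the actual sequences are those set forth in the sequence listing. The sketch does not implement the antigenic index calculation of Jameson and Wolf.

```python
# Positions recited above: (open reading frame, first residue, last residue),
# 1-based and inclusive.
EPITOPE_POSITIONS = {
    "4-2": ("ORF 3", 90, 122),
    "3-2": ("ORF 2", 613, 654),
    "3-2e": ("ORF 2", 613, 660),
}


def extract_region(protein: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive) of a protein sequence."""
    if not (1 <= first <= last <= len(protein)):
        raise ValueError("positions fall outside the sequence")
    return protein[first - 1:last]


def epitope_length(name: str) -> int:
    """Number of residues spanned by a named epitope."""
    _, first, last = EPITOPE_POSITIONS[name]
    return last - first + 1
```

Under this convention the 3-2e region spans 48 residues, the 3-2 region spans 42 residues, and the 4-2 region spans 33 residues.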

For example, two epitopes of interest are discussed in detail below and are referred to as 
3-2e and 4-2 which are encoded by portions of ORF 2 and ORF 3 of the hepatitis E genome, 
respectively. These epitopes were identified in the Burmese strains of HEV (referred to below 
as B 3-2e (SEQ ID NO:172) and B 4-2 (SEQ ID NO:171)), and in the Mexican strain of HEV 
(referred to below as M 3-2e (SEQ ID NO:170) and M 4-2 (SEQ ID NO:169)). Similar 
epitopes were identified in HEV US-1 based on amino acid sequence comparisons, and are 
referred to below as U3-2e (SEQ ID NO:174) and U4-2 (SEQ ID NO:173). Similar epitopes 
were identified in HEV US-2, also based on amino acid sequence comparisons, and are referred 
to below as US-2 3-2e (SEQ ID NO:223) and US-2 4-2 (SEQ ID NO:224). 



In addition, potential epitopes may be identified using screening procedures well known 
and thoroughly documented in the art. For example, based on the nucleic acid sequences 
defining either the entire or portions of the HEV US-1 or the HEV US-2 genome, it is possible 
to generate an expression library, which, after expression can be screened to identify epitopes. 
For example, nucleic acid fragments representative of the HEV US-1 or the HEV US-2 genome 
can be cloned into the lambda-gt11 expression vector to produce a lambda-gt11 library, for 
example, a cDNA library. The library then is screened for encoded epitopes that can bind 
specifically with sera derived from individuals identified as being infected with HEV US-1 or 
HEV US-2. See, for example, Glover (1985) in "DNA Cloning Techniques, A Practical 
Approach", IRL Press, pp. 49-78. Typically, about 10^6-10^7 phage are screened, from which 
positive phage are identified, purified, and then tested for specificity of binding to sera from 
different individuals previously infected with HEV US-1 or HEV US-2. Phage which bind 
selectively to antibodies present in sera or plasma from the individual are selected for additional 
characterization. Once identified, an amino acid sequence of interest may be produced in large 
scale either by use of conventional recombinant DNA methodologies or by conventional 
peptide synthesis methodologies, well known and thoroughly documented in the art. 

b. Marker Polypeptides: It is contemplated that if the marker is a US-type or 
US-subtype virus or a specific polypeptide thereof, the binding partner useful in the practice of the 
invention preferably is an antibody, for example, a polyclonal or monoclonal antibody, that 
binds to an epitope on the virus or marker polypeptide. The binding partner may be either 
labeled with a detectable moiety or immobilized on a solid support. In particular, the 
antibodies useful in the practice of this embodiment preferably are capable of binding 
specifically to a US-type or US-subtype specific polypeptide chain preferably at least 5, more 
preferably at least 8, even more preferably at least 15, and most preferably at least 25 
contiguous amino acid residues in length which is unique with respect to the corresponding 
amino acid sequence found in members of the Burmese and Mexican families. 

An antibody useful in the practice of this embodiment of the invention preferably is 
capable of binding specifically to a polypeptide chain selected from the group consisting of 
SEQ ID NOS:91, 92, and 93, including naturally occurring variants thereof, and has a higher 
binding affinity for such a polypeptide chain relative to the corresponding sequences of 
members of the Burmese and Mexican families. It is contemplated that an antibody useful in 
the practice of the invention preferably is capable of binding specifically to a polypeptide chain 
comprising the amino acid sequence set forth in SEQ ID NO:173 or 175. This antibody is 
further characterized as, under similar conditions, preferably having a lower affinity for, and 
most preferably failing to bind the amino acid sequence set forth in SEQ ID NOS:169 or 171 
or regions in the Burmese and Mexican strains that correspond to SEQ ID NO:175. Similarly, 
it is contemplated that an antibody useful in the practice of the invention preferably is capable 
of binding specifically to a polypeptide chain comprising the amino acid sequence set forth in 
SEQ ID NOS:174 or 176. This antibody is further characterized as, under similar 
conditions, preferably having a lower affinity for, and most preferably failing to bind the amino 
acid sequence set forth in SEQ ID NOS:170 or 172 or regions in the Burmese and Mexican 
strains that correspond to SEQ ID NO:176. 

Similarly, it is contemplated that an antibody useful in the practice of this embodiment 
of the invention preferably is capable of binding specifically to a polypeptide chain selected 
from the group consisting of SEQ ID NOS:166, 167, and 168, including naturally occurring 
variants thereof, and has a higher binding affinity for such a polypeptide chain relative to the 
corresponding sequences of members of the Burmese and Mexican families. It is contemplated 
that an antibody useful in the practice of the invention preferably is capable of binding 
specifically to a polypeptide chain comprising the amino acid sequence set forth in SEQ ID 
NO:223. This antibody is further characterized as, under similar conditions, preferably 
having a lower affinity for, and most preferably failing to bind the amino acid sequences set 
forth in SEQ ID NOS:170 or 172. Similarly, it is contemplated that an antibody useful in the 
practice of the invention preferably is capable of binding specifically to a polypeptide chain 
comprising the amino acid sequence set forth in SEQ ID NO:224. This antibody is further 
characterized as, under similar conditions, preferably having a lower affinity for, and most 
preferably failing to bind the amino acid sequence set forth in SEQ ID NOS:169 or 171. 

The antibodies or antigen binding fragments thereof as described herein can be provided
individually to detect US-type or US-subtype specific antigens. Combinations of the antibodies
(and antigen binding fragments thereof) provided herein also may be used together as
components in a mixture or "cocktail" of at least two antibodies, each having a different binding
specificity for a separate US-type or US-subtype specific antigen.

c. Antibody Production: It is contemplated that one skilled in the art, in possession of
the nucleic acid sequences defining, or amino acid sequences encoded by, at least a portion of
the ORF 1, ORF 2 and/or ORF 3 sequences of a US-type or a US-subtype hepatitis E virus may
be able to produce specific antibodies using techniques well known and thoroughly documented
in the art. See, for example, Practical Immunology, Butt, W.R., ed., Marcel Dekker, NY, 1984.
Briefly, an isolated target protein is used to raise antibodies in a xenogenic host, such as a
mouse, pig, goat or other suitable mammal. Preferred antibodies are antibodies that bind
specifically to an epitope on the target protein, preferably having a binding affinity greater than
10^5 M^-1, and most preferably having a binding affinity greater than 10^7 M^-1 for that epitope.

Typically, the target protein is combined with a suitable adjuvant capable of enhancing
antibody production in the host, and injected into the host, for example, by intraperitoneal
administration. Any adjuvant suitable for stimulating the host's immune response may be used
to advantage. A commonly used adjuvant is Freund's complete adjuvant (an emulsion
comprising killed and dried microbial cells, e.g., from Calbiochem Corp., San Diego, CA or
Gibco, Grand Island, NY). Where multiple antigen injections are desired, the subsequent
injections comprise the antigen in combination with an incomplete adjuvant (e.g., cell-free
emulsion).

Polyclonal antibodies may be isolated from the antibody-producing host by extracting
serum containing antibodies to the protein of interest. Monoclonal antibodies may be produced
by isolating host cells that produce the desired antibody, fusing these cells with myeloma cells
using standard procedures known in the immunology art (see, for example, Kohler and
Milstein, Nature (1975) 256:495), and screening for hybrid cells (hybridomas) that react
specifically with the target protein and have the desired binding affinity.

In addition, it is contemplated that when small peptides are used, their immunogenicity
may be enhanced by coupling to solid supports. For example, an epitope or antigenic region or
fragment of a polypeptide generally is relatively small, and may comprise about 8 to 10 amino
acids or fewer. Fragments of as few as 3 amino acids may characterize an antigenic
region. These polypeptides may be linked to a suitable carrier molecule when the polypeptide
of interest folds to provide the correct epitope but is too small to be antigenic.

Preferred linking reagents and methodologies for their use are well known in the art and
may include, without limitation, N-succinimidyl-3-(2-pyridylthio)propionate (SPDP) and
succinimidyl 4-(N-maleimidomethyl)cyclohexane-1-carboxylate (SMCC). Furthermore,
polypeptides lacking sulfhydryl groups can be modified by adding a cysteine residue. These
reagents create a disulfide linkage between themselves and peptide cysteine residues on one
protein and an amide linkage through the epsilon-amino group on a lysine, or other free amino
group, in the other. A variety of such disulfide/amide-forming agents are known. Other bifunctional
coupling agents form a thioether rather than a disulfide linkage. Many of these
thioether-forming agents are commercially available and are known to those of ordinary skill in the art.
The carboxyl groups can be activated by combining them with succinimide or
1-hydroxyl-2-nitro-4-sulfonic acid, sodium salt. Any carrier which does not itself induce the production of
antibodies harmful to the host can be used. Suitable carriers include proteins, polysaccharides
such as latex functionalized sepharose, agarose, cellulose, cellulose beads, polymeric amino
acids such as polyglutamic acid, polylysine, amino acid copolymers and inactive virus
particles, among others. Examples of protein substrates include serum albumins, keyhole
limpet hemocyanin, immunoglobulin molecules, thyroglobulin, ovalbumin, tetanus toxoid, and
yet other proteins known to those skilled in the art.

In addition, it is contemplated that biosynthetically produced antibody binding domains
wherein the amino acid sequence of the binding domain is manipulated to enhance binding
affinity to a preferred epitope also may be useful in the practice of the invention. A detailed
description of their preparation can be found, for example, in Practical Immunology, Butt,
W.R., ed., Marcel Dekker, New York, 1984. Optionally, a monovalent antibody fragment such
as an Fab or an Fab' fragment may be utilized. Additionally, genetically engineered
biosynthetic antibody binding sites may be utilized which comprise either 1) non-covalently
associated or disulfide bonded synthetic VH and VL dimers, 2) covalently linked VH-VL single
chain binding sites, 3) individual VH or VL domains, or 4) single chain antibody binding sites,
as disclosed, for example, in U.S. Patent Nos. 5,091,513 and 5,132,405.



It is contemplated that intact antibodies (for example, monoclonal or polyclonal
antibodies), antibody fragments or biosynthetic antibody binding sites that bind a US-type or
US-subtype hepatitis E virus specific epitope will be useful in diagnostic and prognostic
applications, and also will be useful in passive immunotherapy.

d. Assay Formats: It is contemplated that both polypeptides which react
immunologically with serum containing anti-US-type or anti-US-subtype hepatitis E virus
specific antibodies, and antibodies raised against US-type or US-subtype hepatitis E specific
epitopes, will be useful in immunoassays to detect the presence of such a virus in a test sample
of interest. Furthermore, it is contemplated that the presence of US-type or US-subtype
hepatitis E virus in a sample may be detected using any of a wide range of immunoassay
techniques, for example, direct assays, sandwich assays, and/or competition assays, currently
known and thoroughly documented in the art. A variety of preferred assay formats are
described in more detail below.

In one preferred format, the assay employs a sandwich format. Sandwich
immunoassays typically are highly specific and very sensitive, provided that labels with good
limits of detection are used. A detailed review of immunological assay design, theory and
protocols can be found in numerous texts in the art, including Practical Immunology, Butt,
W.R., ed., Marcel Dekker, New York, 1984.

In one type of sandwich format, a polypeptide (binding partner) which has been
immobilized onto a solid support and is immunologically reactive with an anti-US-type or
anti-US-subtype hepatitis E virus antibody (marker), is contacted with a test sample from an
individual suspected of having been infected with the US-type or US-subtype hepatitis E virus,
to form a mixture. The mixture then is incubated for a time and under conditions sufficient to
form polypeptide/antibody complexes. Then, an indicator reagent comprising a monoclonal or
a polyclonal antibody or a fragment thereof, which specifically binds to the test sample
antibody and is labeled with a detectable moiety, is contacted with the antigen/antibody
complexes to form a second mixture. The second mixture then is incubated for a time and
under conditions sufficient to form antigen/antibody/antibody complexes. The presence of
anti-US-type or anti-US-subtype hepatitis E antibody, if any, in the test sample is determined by
detecting the presence of detectable moiety immobilized to the solid support. The amount of
antibody present in the test sample is proportional to the signal generated. Binding pairs such
as biotin and antibiotin, biotin and avidin, biotin and streptavidin, and the like, may be used to
enhance the generated signal in the assay systems described herein.

In an alternative format of the above-described assay, the immunologically reactive
polypeptide may be immobilized "indirectly" to the solid support, i.e., through a monoclonal or
polyclonal antibody or fragment thereof which specifically binds that polypeptide.
Alternatively, in another format, the assay components may be used in the reverse
configuration, such that an antibody or antigen binding fragment thereof, which specifically
binds the test sample antibody, i.e., marker antibody (for example, IgG or IgM), and is
immobilized on the solid support, is contacted with the test sample for a time and under
conditions sufficient to permit formation of antibody/antibody complexes. Then, an indicator
reagent, for example, a US-type or US-subtype hepatitis E polypeptide immunologically
reactive with captured test sample antibody and labeled with a detectable moiety, is incubated
with the antibody/antibody complexes to form a second mixture for a time and under conditions
sufficient to permit formation of antibody/antibody/antigen complexes. As above, the presence
of antibody in the test sample, if any, that is captured by the capture antibody or antigen
binding fragment thereof immobilized on the solid support is determined by detecting the
measurable signal generated by the detectable moiety.

It is contemplated that the aforementioned sandwich assays also may be used to test for
the presence of a US-type or US-subtype hepatitis E virus, or immunologically reactive
polypeptides thereof, in a test sample by routine modification of the above-described assay
configurations. It is contemplated that such modifications would be well known to one skilled
in the art.

In addition to the aforementioned sandwich assays, it is contemplated that competitive
assays may also be employed in the practice of the invention. In this format, one or a
combination of at least two antibodies, preferably monoclonal antibodies, which specifically
bind to a US-type or US-subtype hepatitis E specific polypeptide chain can be employed as a
competitive probe for the detection of antibodies to the US-type or the US-subtype specific
protein. For example, a first HEV US-1 specific polypeptide chain such as one of the
polypeptides disclosed herein, acting as a binding partner for the marker, is immobilized on a
solid support. A test sample suspected of containing antibody to HEV US-1 antigen then is
incubated with the solid support together with an indicator reagent comprising, for example, an
isolated anti-US-type or anti-US-subtype antibody that binds the immobilized HEV US-1
specific polypeptide chain and is labeled with a detectable moiety, for a time and under conditions
sufficient to form antigen/antibody complexes immobilized to the solid support. If the marker
antibody is present in the test sample, then the marker antibody competes with the labeled
indicator reagent for binding the immobilized polypeptide. As the amount of marker antibody
present in the test sample increases, the amount of labeled indicator reagent that binds the
immobilized polypeptide decreases. A reduction in the amount of indicator reagent bound to
the solid phase can be quantitated. A measurable reduction in signal compared to the signal
generated from a confirmed negative non-A, non-B, non-C, non-D, non-E hepatitis test sample
also is indicative of the presence of anti-HEV US-1 antibody in the test sample. It is
contemplated that similar protocols may be used to identify the presence in a test sample of
other hepatitis E viruses falling within the US-type or US-subtype classes.
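
The signal comparison in the competitive format can be illustrated numerically. The following
is a minimal sketch only and is not part of the disclosed protocol: the function names, the
example signal values and the 50% cutoff are hypothetical assumptions chosen for illustration.

```python
def percent_inhibition(sample_signal: float, negative_control_signal: float) -> float:
    """Percent reduction in bound-label signal relative to a confirmed negative sample."""
    if negative_control_signal <= 0:
        raise ValueError("negative control signal must be positive")
    return 100.0 * (negative_control_signal - sample_signal) / negative_control_signal


def is_reactive(sample_signal: float,
                negative_control_signal: float,
                cutoff_percent: float = 50.0) -> bool:
    """Flag a sample whose signal reduction meets the cutoff.

    cutoff_percent is a hypothetical value; a real assay would derive it
    from validation data for that assay.
    """
    return percent_inhibition(sample_signal, negative_control_signal) >= cutoff_percent


if __name__ == "__main__":
    # Made-up signals: negative control reads 2.0, test sample reads 0.5.
    print(percent_inhibition(0.5, 2.0))
    print(is_reactive(0.5, 2.0))
```

A larger percent inhibition corresponds to more marker antibody competing with the labeled
indicator reagent for the immobilized polypeptide.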

In yet another detection method, the antibodies of the present invention may be
employed to detect the presence of US-type or US-subtype hepatitis E specific antigens in fixed
tissue sections, as well as fixed cells, by immunohistochemical analysis. Cytochemical analysis
wherein these antibodies are labeled directly with a detectable moiety (e.g., fluorescein,
colloidal gold, horseradish peroxidase, alkaline phosphatase, etc.) or are labeled indirectly, for
example, by means of a secondary antibody labeled with a detectable moiety, also may be used
in the practice of the invention.

In another assay format, the presence of antibody and/or antigen can be detected by
means of a simultaneous assay, for example, as described in EP Publication No. 0 473 065. For
example, a test sample is contacted simultaneously with (i) a capture reagent for a first analyte,
wherein the capture reagent comprises a first binding member specific for a first analyte
immobilized on a solid support and (ii) a capture reagent for a second analyte, wherein the
capture reagent comprises a first binding member for a second analyte immobilized on a second,
different solid support, to produce a mixture. The mixture then is incubated for a time and
under conditions sufficient to form capture reagent/first analyte and capture reagent/second
analyte complexes. The complexes so formed then are contacted with a first indicator reagent
comprising a member of a binding pair specific for the first analyte labeled with a detectable
moiety and a second indicator reagent comprising a member of a binding pair specific for the
second analyte labeled with a detectable moiety, to produce a second mixture. The second
mixture then is incubated for a time and under conditions sufficient to produce both capture
reagent/first analyte/first indicator reagent and capture reagent/second analyte/second indicator
reagent complexes. The presence of one or more analytes is determined by detecting a signal
generated by the complexes formed on either or both solid phases as an indication of the
presence of one or more analytes in the test sample.

Other assay systems may employ an antibody which specifically binds US-type or
US-subtype hepatitis E viral particles or sub-viral particles encapsulating the viral genome (or
fragments thereof) by virtue of a contact between the specific antibody and the viral protein
(peptide, etc.). The captured particles then can be analyzed by methods such as LCR or PCR to
determine whether the viral genome is present in the test sample. The advantage of utilizing
such an antigen capture amplification method is that it can separate the viral genome from other
molecules in the test specimen by use of a specific antibody. Such a method has been described
in EP 0 672 176, published September 20, 1995.

In general, immunoassay design considerations include preparation of antibodies (e.g.,
monoclonal or polyclonal antibodies or antigen binding fragments thereof) having sufficiently
high binding specificity for the target protein to form a complex that can be distinguished
reliably from products of nonspecific interactions. Typically, the higher the antibody binding
specificity, the lower the concentration of target that can be detected.

Both the polypeptide and antibody reagents of the invention may be used to develop
assays as described herein to detect either the presence of an antigen from or an antibody that
binds to a US-type or US-subtype hepatitis E virus. In addition to their use in immunoassays, it
is contemplated that the aforementioned polypeptides may be used either alone or in
combination with adjuvants for use in the production of antibodies in laboratory animals, or
similarly, used in combination with pharmaceutically acceptable carriers as vaccines for either
the prophylactic or therapeutic immunization of individuals. Also, it is contemplated that, in
addition to their use in immunoassays, the antibodies of the invention may be used in
combination with, for example, a pharmaceutically acceptable carrier for use in passive,
therapeutic or prophylactic immunization of an individual. These latter uses are described in
more detail in section (III) below. The antibodies of the invention can also be used for the
generation of chimeric antibodies for therapeutic use, or other similar applications.

Kits suitable for immunodiagnosis and containing the appropriate reagents may be
constructed by packaging the appropriate materials, including, for example, a polypeptide
defining a specific epitope of interest or antibodies that bind such epitopes, in suitable
containers. In addition, the kit optionally may include additional reagents, for example,
suitable detection systems and buffers.

In addition, these antibodies, preferably monoclonal, can be bound to matrices similar to
CNBr-activated Sepharose and used for the affinity purification of US-type or US-subtype
hepatitis E specific proteins from cell cultures, or biological tissues such as blood and liver, so
as to purify recombinant and native viral antigens and proteins.

II. (ii) Nucleic Acid-based Assays 

It is contemplated that if the marker is a US-type or US-subtype specific nucleotide
sequence, the binding partner preferably also is a nucleotide sequence or an analog thereof that
hybridizes specifically to the marker sequence or to regions adjacent thereto. Based on the
unique polynucleotide sequences disclosed herein, it is contemplated that a binding partner may
be a nucleotide sequence complementary to a US-type or US-subtype specific nucleotide
sequence, for example, a nucleotide sequence or analog thereof complementary to at least a
portion of an ORF 1 sequence, an ORF 2 sequence, or an ORF 3 sequence of a US-type or
US-subtype hepatitis E virus, which is unique when compared to the corresponding nucleotide
sequences of the Burmese and Mexican families. Furthermore, it is contemplated that
noncoding portions of the genome of US-type and US-subtype hepatitis E viruses which are
unique relative to the genomes of the Burmese and Mexican families of hepatitis E also may
provide useful markers in the practice of the invention. Such nucleotide sequences (either
primers or probes) are of a length which allows detection of US-type or US-subtype specific
sequences by hybridization and/or amplification and may be prepared using routine, standard
methods, including automated oligonucleotide synthesis methodologies, well known and
thoroughly discussed in the art. A complement of any unique portion of the HEV US-1
genome will be satisfactory. Complete complementarity is desirable for use as probes,
although it may be unnecessary as the length of the fragment is increased.

Similarly, it is contemplated that the binding partner may be a polynucleotide sequence,
for example, a DNA, RNA or PNA sequence, preferably comprising 8-100 nucleotides, more
preferably comprising 10-75 nucleotides and most preferably comprising 15-50 nucleotides,
which is capable of hybridizing specifically to the target sequence. It is understood that the
target sequence may be a nucleotide sequence defining at least a portion of a genome of a
US-type or US-subtype hepatitis E virus, or a sequence complementary thereto. It is known in the
art that the particular stringency conditions selected for a hybridization reaction depend largely
upon the degree of complementarity of the binding partner nucleic acid sequence with the target
sequence, the composition of the binding sequence and the length of the binding sequence. The
parameters for determining stringency conditions are well known to those of ordinary skill in
the art or are deemed to be readily ascertained from standard textbooks (see, for example,
Maniatis et al., Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Press, N.Y.,
1989)).
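
The dependence of hybridization conditions on probe composition and length can be illustrated
with a rough melting temperature estimate. The sketch below uses the Wallace rule
(Tm = 2(A+T) + 4(G+C) degrees C), a common rule of thumb for short oligonucleotides; this rule
is not taken from the present disclosure and is offered only as an assumption for illustration.
The function name and the example probe sequence are hypothetical.

```python
def wallace_tm(probe: str) -> int:
    """Estimate melting temperature (degrees C) of a short oligonucleotide.

    Uses the Wallace rule, 2*(A+T) + 4*(G+C), which is only a rough guide
    and is generally applied to oligonucleotides of about 14-20 bases.
    """
    seq = probe.upper().replace("U", "T")  # treat RNA bases as DNA for counting
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("probe must be a non-empty string of A, C, G, T/U")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Hypothetical 15-mer, not a sequence from this disclosure.
    print(wallace_tm("ATGCATGCATGCATG"))
```

Under this approximation, longer or more G/C-rich binding sequences give higher estimated
melting temperatures and so tolerate more stringent conditions.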

The sequences provided herein may be used to produce probes which can be used in
assays for the detection of nucleic acids in test samples. The probes may be designed from
conserved nucleotide regions of the polynucleotides of interest or from non-conserved
nucleotide regions of the polynucleotide of interest. The design of such probes for optimization
in assays is within the skill of the routineer. Generally, nucleic acid probes are developed from
non-conserved or unique regions when maximum specificity is desired, and nucleic acid probes
are developed from conserved regions when assaying for nucleotide regions that are closely
related to, for example, different members of a multigene family or in related species like
mouse and man.

One preferred protocol provides a method of detecting the presence or absence of a
US-type or US-subtype hepatitis E virus in a test sample. The method comprises the steps of
(a) providing a probe comprising a polynucleotide sequence containing at least 15 contiguous
nucleotides from a US-type or US-subtype isolate, wherein the sequence is not present in other
members of the hepatitis E Burmese and Mexican families; (b) contacting the test sample and
the probe under conditions that permit formation of a polynucleotide duplex between the probe
and its complement, in the absence of substantial polynucleotide duplex formation between the
probe and non US-type and non US-subtype hepatitis polynucleotide sequences present in the
test sample; and (c) detecting the presence of any polynucleotide duplexes containing the probe.
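
Step (a) of this protocol, selecting contiguous stretches that are absent from other families,
can be sketched computationally. The following is a minimal illustration under stated
assumptions: the function name is hypothetical, only exact matches are excluded (near-identical
sequences that might still cross-hybridize are not considered), and the short sequences in the
usage example are made up rather than taken from any SEQ ID NO.

```python
def candidate_probes(target: str, others: list[str], k: int = 15) -> list[str]:
    """Return k-mers of `target` that occur in none of the `others` sequences.

    Exact-match screening only; this does not model hybridization with
    mismatches, so it is a first-pass filter rather than a probe design tool.
    """
    excluded = set()
    for seq in others:
        s = seq.upper()
        excluded.update(s[i:i + k] for i in range(len(s) - k + 1))

    t = target.upper()
    seen = set()
    result = []
    for i in range(len(t) - k + 1):
        kmer = t[i:i + k]
        if kmer not in excluded and kmer not in seen:
            seen.add(kmer)
            result.append(kmer)
    return result


if __name__ == "__main__":
    # Toy example with k=4 so the output is easy to follow.
    print(candidate_probes("ACGTACGA", ["ACGTAC"], k=4))
```

With real sequences, `k` would be left at 15 or more to match the minimum contiguous length
stated in step (a).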

Preferred nucleotide sequences may comprise nucleotide residue numbers 1 through
5097 of SEQ ID NO:89, or a naturally occurring sequence variant thereof. With regard to this
sequence, the term "a naturally occurring sequence variant" includes any nucleic acid sequence
that is at least 73.3%, preferably at least 75.3%, more preferably at least 78.3%, and most
preferably at least 95% identical to residues 1 through 5097 of SEQ ID NO:89. Other preferred
marker or binding partner sequences may comprise nucleotide residue numbers 5132 through
7114 of SEQ ID NO:89, or a naturally occurring sequence variant thereof. With regard to this
sequence, the term "naturally occurring sequence variant" includes any nucleic acid sequence
that is at least 87.4%, preferably at least 89.4%, more preferably at least 92.4%, and most
preferably at least 95% identical to residues 5132 through 7114 of SEQ ID NO:89. Other
preferred marker or binding partner sequences may comprise nucleotide residue numbers 5094
through 5462 of SEQ ID NO:89, or a naturally occurring sequence variant thereof. With regard
to this sequence, the term "naturally occurring sequence variant" includes any nucleic acid
sequence that is at least 88.3% identical, preferably at least 90.3% identical, more preferably at
least 93.3% identical, and most preferably at least 95% identical to residues 5094 through 5462
of SEQ ID NO:89.
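
The identity thresholds recited above can be applied mechanically once two sequences have been
aligned. The sketch below is illustrative only: the function names are hypothetical, the
alignment is assumed to have been produced beforehand by some other tool, and the choice to
skip gap columns when counting is an assumption, since the text does not specify how identity
is computed. The tier table uses the figures given for residues 5132 through 7114 of
SEQ ID NO:89; the example sequences are made up.

```python
# Thresholds recited for residues 5132 through 7114 of SEQ ID NO:89,
# ordered from the strictest tier to the minimum.
REGION_5132_7114_TIERS = [
    (95.0, "most preferred"),
    (92.4, "more preferred"),
    (89.4, "preferred"),
    (87.4, "minimum"),
]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Columns containing a gap ('-') in either sequence are skipped; this gap
    handling is an assumption made for the sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def classify(identity: float, tiers=REGION_5132_7114_TIERS):
    """Return the label of the strictest tier met, or None if below all tiers."""
    for threshold, label in tiers:
        if identity >= threshold:
            return label
    return None


if __name__ == "__main__":
    ident = percent_identity("ACGTACGTAC", "ACGTACGTAA")  # toy sequences
    print(ident, classify(ident))
```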

Furthermore, it is contemplated that useful nucleotide sequences may include, for
example, portions of the ORF 1 sequence encoding, for example, a protein selected from the
group consisting of the methyltransferase protein, the protease protein, the Y domain protein,
the X domain protein, the helicase protein, the hypervariable region protein and the
RNA-dependent RNA polymerase protein, or a variant thereof. Accordingly, it is contemplated that a
useful methyltransferase encoding region of ORF 1 preferably has at least 78%, more
preferably has at least 80%, and most preferably has at least 83% identity to residues 1-693 of
SEQ ID NO:89. Also, it is contemplated that a useful protease encoding region of ORF 1
preferably has at least 66.1%, more preferably has at least 68.1%, and most preferably has at
least 71.1% identity to residues 1270-2091 of SEQ ID NO:89. Also, it is contemplated that a
useful Y domain encoding region of ORF 1 has at least 80%, more preferably has at least 82%,
and most preferably has at least 85% identity to residues 619-1272 of SEQ ID NO:89. Also, it
is contemplated that a useful X domain encoding region of ORF 1 has at least 73.5%, more
preferably has at least 75.5%, and most preferably has at least 78.5% identity to residues
2365-2841 of SEQ ID NO:89. Also, it is contemplated that a useful helicase encoding region of ORF
1 has at least 77.5%, more preferably has at least 79.5%, and most preferably has at least
81.5% identity to residues 2893-3591 of SEQ ID NO:89. Also, it is contemplated that a useful
hypervariable region encoding region of ORF 1 has at least 51.2%, more preferably has at least
53.2%, and most preferably has at least 56.2% identity to residues 2092-2364 of SEQ ID
NO:89. Also, it is contemplated that a useful RNA-dependent RNA polymerase encoding
region of ORF 1 has at least 76.3%, more preferably has at least 78.3%, and most preferably
has at least 81.3% identity to residues 3634-5094 of SEQ ID NO:89.

Preferred nucleotide sequences may comprise nucleotide residue numbers 36 through
5162 of SEQ ID NO:164, or a naturally occurring sequence variant thereof. With regard to this
sequence, the term "a naturally occurring sequence variant" includes any nucleic acid sequence
that is at least 73.6%, preferably at least 75.6%, more preferably at least 78.6% and most
preferably at least 95% identical to residues 36 through 5162 of SEQ ID NO:164. Other
preferred marker or binding partner sequences may comprise nucleotide residue numbers 5197
through 7179 of SEQ ID NO:164, or a naturally occurring sequence variant thereof. With
regard to this sequence, the term "naturally occurring sequence variant" includes any nucleic
acid sequence that is at least 80.7%, preferably at least 82.7%, more preferably at least 85.7%
and most preferably at least 95% identical to residues 5197 through 7179 of SEQ ID NO:164.
Other preferred marker or binding partner sequences may comprise nucleotide residue numbers
5159 through 5527 of SEQ ID NO:164, or a naturally occurring sequence variant thereof. With
regard to this sequence, the term "naturally occurring sequence variant" includes any nucleic
acid sequence that is at least 87.9% identical, preferably at least 89.9% identical, more
preferably at least 92.9% identical and even more preferably at least 95% identical to residues
5159 through 5527 of SEQ ID NO:164.

Furthermore, it is contemplated that useful HEV US-2 nucleotide sequences may
include, for example, portions of the ORF 1 sequence encoding, for example, at least a portion
of a protein selected from the group consisting of the methyltransferase protein, the protease
protein, the Y domain protein, the X domain protein, the helicase protein, the hypervariable
region protein and the RNA-dependent RNA polymerase protein, or a variant thereof.
Accordingly, it is contemplated that a useful methyltransferase encoding region of ORF 1
preferably has at least 79.5%, more preferably has at least 81.5%, and most preferably has at
least 84.5% identity to residues 36-755 of SEQ ID NO:164. Also, it is contemplated that a
useful protease encoding region of ORF 1 preferably has at least 66.1%, more preferably has at
least 68.1%, and most preferably has at least 71.1% identity to residues 1332-2153 of SEQ ID
NO:164. Also, it is contemplated that a useful Y domain encoding region of ORF 1 has at least
80.7%, more preferably has at least 82.7%, and most preferably has at least 85.7% identity to
residues 680-1334 of SEQ ID NO:164. Also, it is contemplated that a useful X domain
encoding region of ORF 1 has at least 73.7%, more preferably has at least 75.7%, and most
preferably has at least 78.7% identity to residues 2430-2906 of SEQ ID NO:164. Also, it is
contemplated that a useful helicase encoding region of ORF 1 has at least 76.4%, more
preferably has at least 78.4%, and most preferably has at least 81.4% identity to residues
2958-3656 of SEQ ID NO:164. Also, it is contemplated that a useful hypervariable region encoding
region of ORF 1 has at least 50.4%, more preferably has at least 52.8%, and most preferably
has at least 55.8% identity to residues 2154-2429 of SEQ ID NO:164. Also, it is contemplated
that a useful RNA-dependent RNA polymerase encoding region of ORF 1 has at least 76.8%,
more preferably has at least 78.8%, and most preferably has at least 81.8% identity to residues
3699-5159 of SEQ ID NO:164.

Other useful nucleotide sequences comprise the nucleotide sequences that encode the
amino acid sequences selected from the group consisting of SEQ ID NOS:93, 168, 173, 174,
175, 176, 223, and 224 and nucleotide sequences complementary thereto.

It is contemplated that the nucleic acid sequences provided herein may be used to
determine the presence of US-type or US-subtype hepatitis E virus in a test sample by
conventional nucleic acid based assays, for example, by polymerase chain reaction (PCR)
and/or by blot hybridization studies (described in detail below). In addition to their use in
nucleic acid based assays, it is contemplated that the aforementioned nucleic acid sequences may be
integrated in vectors which may then be transformed or transfected into a host cell of interest,
for example, vaccinia or mycobacteria. The resulting host cells may then be combined with a
pharmaceutically acceptable carrier and used, for example, as a recombinant vaccine for
immunizing a mammal, either prophylactically or therapeutically, against a preselected US-type
or US-subtype hepatitis E virus.

The polymerase chain reaction (PCR) is a technique for amplifying a desired nucleic
acid sequence (target) contained in a nucleic acid or mixture thereof. In PCR, a pair of primers
typically are employed in excess to hybridize at the outside ends of complementary strands of
the target nucleic acid. The primers are each extended by a polymerase, for example, a
thermostable polymerase, using the target nucleic acid as a template. The extension products
become target sequences themselves, following dissociation from the original target strand.
New primers then are hybridized and extended by a polymerase, and the cycle is repeated to
geometrically increase the number of target sequence molecules. PCR is disclosed in U.S.
Patent Nos. 4,683,195 and 4,683,202.
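
The geometric increase in target molecules can be made concrete with a short calculation. The
sketch below is illustrative only: the function name is hypothetical, and the per-cycle
efficiency parameter is an assumption introduced here (an efficiency of 1.0 corresponds to
perfect doubling in every cycle, which real reactions only approximate).

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of target molecules after a number of PCR cycles.

    Each cycle multiplies the target count by (1 + efficiency), where
    efficiency lies between 0 (no amplification) and 1 (perfect doubling).
    Plateau effects from reagent depletion are ignored.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # 10 starting molecules, 30 cycles of ideal doubling.
    print(pcr_copies(10, 30))
    # Same reaction at an assumed 90% per-cycle efficiency.
    print(pcr_copies(10, 30, efficiency=0.9))
```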

The Ligase Chain Reaction (LCR) is an alternate method for nucleic acid amplification.
In LCR, probe pairs are used which include two primary (first and second) and two secondary
(third and fourth) probes, all of which are employed in molar excess of the target nucleic acid
sequence. The first probe hybridizes to a first segment of the target strand and the second probe
hybridizes to a second segment of the target strand, the first and second segments being
contiguous so that the primary probes abut one another in 5'-phosphate-3'-hydroxyl relationship,
and so that a ligase can covalently fuse or ligate the two probes into a fused product. In
addition, a third (secondary) probe can hybridize to a portion of the first probe and a fourth
(secondary) probe can hybridize to a portion of the second probe in a similar abutting fashion.
Once the ligated strand of primary probes is separated from the target strand, it will hybridize
with the third and fourth probes which can be ligated to form a complementary, secondary
ligated product. The ligated products are functionally equivalent to either the target or its
complement. By repeated cycles of hybridization and ligation, amplification of the target
sequence is achieved. This technique is described more completely in EP-A-320 308 to K.
Backman, published June 16, 1989, and EP-A-439 182 to K. Backman et al., published July 31,
1991.

For amplification of mRNAs, it is within the scope of the present invention to reverse 
transcribe mRNA into cDNA followed by polymerase chain reaction (RT-PCR); or, to use a 
single enzyme for both steps as described in U.S. Patent No. 5,322,770; or to reverse transcribe 
mRNA into cDNA followed by asymmetric gap ligase chain reaction (RT-AGLCR) as 
described by R. L. Marshall, et al., PCR Methods and Applications 4: 80-84 (1994). 

Other known amplification methods which can be utilized herein include but are not 
limited to the so-called "NASBA" or "3SR" technique described in Proc. Natl. Acad. Sci. USA 
87: 1874-1878 (1990) and also described in Nature 350 (No. 6313): 91-92 (1991); Q-beta 
amplification as described in published EP 4544610; strand displacement amplification (as 
described in G. T. Walker et al., Clin. Chem. 42: 9-13 [1996]) and EP 684315; and target 
mediated amplification, as described by PCT Publication WO 9322461. 

In one embodiment, the present invention generally comprises the steps of contacting a 
test sample suspected of containing a target polynucleotide sequence with amplification 
reaction reagents comprising an amplification primer, and a detection probe that can hybridize 
with an internal region of the amplicon sequences. Probes and primers employed according to 
the method herein provided are labeled with capture and detection labels wherein probes are 
labeled with one type of label and primers are labeled with the other type of label. 
Additionally, the primers and probes are selected such that the probe sequence has a lower melt 
temperature than the primer sequences. The amplification reagents, detection reagents and test 
sample are placed under amplification conditions whereby, in the presence of target sequence, 
copies of the target sequence (an amplicon) are produced. The double stranded amplicon then 
is thermally denatured to produce single stranded amplicon members. Upon formation of the 
single stranded amplicon members, the mixture is cooled to allow the formation of complexes 
between the probes and single stranded amplicon members. 

After the probe/single stranded amplicon member hybrids are formed, they are detected. 
Standard heterogeneous assay formats are suitable for detecting the hybrids using the detection 
labels and capture labels present on the primers and probes. The hybrids can be bound to a 
solid phase reagent by virtue of the capture label and detected by virtue of the detection label. 
In cases where the detection label is directly detectable, the presence of the hybrids on the solid 
phase can be detected by causing the label to produce a detectable signal, if necessary, and 
detecting the signal. In cases where the label is not directly detectable, the captured hybrids can 
be contacted with a conjugate, which generally comprises a binding member attached to a 
directly detectable label. The conjugate becomes bound to the complexes and the conjugate's 
presence on the complexes can be detected with the directly detectable label. Thus, the 
presence of the hybrids on the solid phase reagent can be determined. Those skilled in the art 
will recognize that wash steps may be employed to wash away unhybridized amplicon or probe 
as well as unbound conjugate. 

Test samples for detecting target sequences can be prepared using methodologies well 
known in the art such as by obtaining a sample and, if necessary, disrupting any cells contained 
therein to release target nucleic acids. In the case where PCR is employed in this method, the 
ends of the target sequences are usually known. In cases where LCR or a modification thereof 
is employed in the preferred method, the entire target sequence is usually known. Typically, 
the target sequence is a nucleic acid sequence such as, for example, RNA or DNA. 

While the length of the primers and probes can vary, the probe sequences are selected 
such that they have a lower melt temperature than the primer sequences. Hence, the primer 
sequences are generally longer than the probe sequences. Typically, the primer sequences are 
in the range of between 20 and 50 nucleotides long, more typically in the range of between 20 
and 30 nucleotides long. Preferred primer sequences typically are greater than 20 nucleotides 
long. The typical probe is in the range of between 10 and 25 nucleotides long, more typically in 
the range of between 15 and 20 nucleotides long. Preferred probe sequences typically are 
greater than 15 nucleotides long. 
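
The relationship between oligonucleotide length and melt temperature can be sketched with the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius. This rule is a common rough approximation for short oligonucleotides and is an assumption introduced here, not the method used in this specification. The sequences in the usage lines are hypothetical and are not the primers or probes of the invention.

```python
def wallace_tm(sequence):
    """Rough melt temperature (deg C) of a short DNA oligo by the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C). This is an approximation that ignores
    salt concentration, oligo concentration and nearest-neighbor effects.
    """
    seq = sequence.upper()
    if not seq or any(base not in "ACGT" for base in seq):
        raise ValueError("sequence must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def probe_melts_below_primers(probe, primers):
    """True if the probe's estimated Tm is lower than that of every primer."""
    probe_tm = wallace_tm(probe)
    return all(probe_tm < wallace_tm(primer) for primer in primers)


if __name__ == "__main__":
    # Hypothetical sequences chosen only to illustrate the comparison.
    primer = "ATGCATGCATGCATGCATGCAT"   # 22 nucleotides
    probe = "ATGCATGCATGCATG"           # 15 nucleotides
    print(wallace_tm(primer), wallace_tm(probe))
    print(probe_melts_below_primers(probe, [primer]))
```

Because each added base raises the estimate by 2 or 4 degrees, a shorter probe of similar base composition will generally come out below the longer primers, which is consistent with the length ranges given above.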

Alternatively, a probe may be involved in amplifying a target sequence, via a 
process known as "nested PCR". In nested PCR, the probe has characteristics which are similar 
to those of the first and second primers normally used for amplification (such as length, melting 
temperature, etc.) and as such, may itself serve as a primer in an amplification reaction. 
Generally in nested PCR, a first pair of primers (P1 and P2) are employed to form primary 
extension products. One of the primary primers (for example, P1) may optionally be a capture 
primer (i.e., linked to a member of a first reactive pair), whereas the other primary primer (P2) is 
not. A secondary extension product is then formed using a probe (P1') and a probe (P2') which 
may also have a capture type label (such as a member of a second reactive pair) or a detection 
label at its 5' end. The probes are complementary to and hybridize at a site on the template 
near or adjacent the site where the 3' termini of P1 and P2 would hybridize if still in solution. 
Alternatively, a secondary extension product can be formed using the P1 primer with the probe 
(P2') or the P2 primer with the probe (P1'), sometimes referred to as "hemi-nested PCR". Thus, 
a labeled primer/probe set generates a secondary product which is shorter than the primary 
extension product. Furthermore, the secondary product may be detected either on the basis of 
its size or via its labeled ends (by detection methodologies well known to those of ordinary skill 
in the art). In this process, probe and primers are generally employed in equivalent 
concentrations. 

Various methods for synthesizing primers and probes are well known in the art. 
Similarly, methods for attaching labels to primers or probes are also well known in the art. For 
example, it is a matter of routine experimentation to synthesize desired nucleic acid primers or 
probes using conventional nucleotide phosphoramidite chemistry and instruments available 
from Applied Biosystems, Inc. (Foster City, CA), Dupont (Wilmington, DE), or Milligen 
(Bedford, MA). Many methods have been described for labeling oligonucleotides such as the 
primers or probes of the present invention. Enzo Biochemical (New York, NY) and Clontech 
(Palo Alto, CA) both have described and commercialized probe labeling techniques. For 
example, a primary amine can be attached to a 3' oligo terminus using 3'-Amine-ON CPG™ 
(Clontech, Palo Alto, CA). Similarly, a primary amine can be attached to a 5' oligo terminus 
using Aminomodifier II® (Clontech). The amines can be reacted to various haptens using 
conventional activation and linking chemistries. In addition, WO 92/10506, published 25 June 
1992 and U.S. Patent 5,290,925, issued March 1, 1994, teach methods for labeling probes at 
their 5' and 3' termini, respectively. In addition, WO 92/11388 published 9 July 1992 teaches 
methods for labeling probes at their ends. According to one known method for labeling an 
oligonucleotide, a label-phosphoramidite reagent is prepared and used to add the label to the 
oligonucleotide during its synthesis. See, for example, N.T. Thuong et al., Tet. Letters 29(46): 
5905-5908 (1988); or J. S. Cohen et al., published U.S. Patent Application 07/246,688 (NTIS 
ORDER No. PAT-APPL-7-246,688) (1989). Preferably, probes are labeled at their 3' and 5' 
ends. 

Capture labels are carried by the primers or probes and can be a specific binding 
member which forms a binding pair with the solid phase reagent's specific binding member. It 
will be understood, of course, that the primer or probe itself may serve as the capture label. For 
example, in the case where a solid phase reagent's binding member is a nucleic acid sequence, it 
may be selected such that it binds a complementary portion of the primer or probe to thereby 
immobilize the primer or probe to the solid phase. In cases where the probe itself serves as the 
binding member, those skilled in the art will recognize that the probe will contain a sequence or 
"tail" that is not complementary to the single stranded amplicon members. In the case where 
the primer itself serves as the capture label, at least a portion of the primer will be free to 
hybridize with a nucleic acid on a solid phase because the probe is selected such that it is not 
fully complementary to the primer sequence. 

Generally, probe/single stranded amplicon member complexes can be detected using 
techniques commonly employed to perform heterogeneous immunoassays. Preferably, in this 
embodiment, detection is performed according to the protocols used by the commercially 
available Abbott LCx® instrumentation (Abbott Laboratories, Abbott Park, IL). 

Other useful procedures known in the art include solution hybridization, and dot and slot 
blot hybridization protocols. The amount of the target nucleic acid present in a sample 
optionally may be quantitated by measuring the radioactivity of hybridized fragments, using 
standard procedures known in the art. 

III. Vaccines 

It is contemplated that vaccines may be prepared from one or more immunogenic 
polypeptides based on US-type and/or US-subtype specific protein sequences or antibodies that 
bind to such protein sequences. In addition, it is contemplated that vaccines also may comprise 
dead, live but attenuated US-type or US-subtype hepatitis E virus, or a live, recombinant 
vaccine comprising a heterologous host cell, for example, a vaccinia virus, expressing a 
US-type or US-subtype hepatitis E virus specific antigen. 

With regard to the polypeptide based vaccines, the polypeptide must define at least one 
epitope. It is contemplated, however, that the vaccine may comprise a plurality of different 
epitopes which are defined by one or more polypeptide chains. Furthermore, it is contemplated 
that nonstructural proteins as well as structural proteins may provide protection against viral 
pathogenicity, even if they do not cause the production of neutralizing antibodies. Considering 
the above, multivalent vaccines against the US-type or US-subtype virus may comprise one or 
more structural proteins, and/or one or more nonstructural proteins. These immunogenic 
epitopes can be used in combinations, i.e., as a mixture of recombinant proteins, synthetic 
peptides and/or polypeptides isolated from the virion, which may be co-administered at the 
same time or administered at different times. 

Methodologies for the preparation of protein or peptide based vaccines which contain at 
least one immunogenic peptide as an active ingredient are well known in the art. Typically, 
such vaccines are prepared as injectables, either as liquid solutions or suspensions. The 
preparation may be emulsified or the protein may be encapsulated in liposomes. The active 
immunogenic ingredients may be mixed with pharmacologically acceptable excipients which 
are compatible with the active ingredient. Suitable excipients include, without limitation, 
water, saline, dextrose, glycerol, ethanol or a combination thereof. The vaccine also may 
contain small amounts of auxiliary substances such as wetting or emulsifying reagents, pH 
buffering agents, and/or adjuvants which enhance the effectiveness of the vaccine. For 
example, such adjuvants can include aluminum hydroxide, N-acetyl-muramyl-L-threonyl-D-isoglutamine 
(thr-MDP), N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11687, also 
referred to as nor-MDP), 
N-acetyl-muramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine 
(CGP 19835A, also referred to as 
MTP-PE), and RIBI (MPL + TDM + CWS) in a 2% squalene/Tween-80® emulsion. The 
effectiveness of an adjuvant may be determined by measuring the amount of antibodies directed 
against an immunogenic polypeptide containing a US-type or US-subtype specific antigenic 
sequence resulting from administration of this polypeptide in vaccines which also comprise 
various adjuvants under investigation. 

The vaccines usually are administered by intravenous or intramuscular injection. 
Additional formulations which are suitable for other modes of administration include 
suppositories and, in some cases, oral formulations. For suppositories, traditional binders and 
carriers may include but are not limited to polyalkylene glycols or triglycerides. Such 
suppositories may be formed from mixtures containing the active ingredient in the range of 
from about 0.5% to about 10%, preferably, from about 1% to about 2% (w/w). Oral 
formulations may include excipients including, for example, mannitol, lactose, starch, 
magnesium stearate, sodium saccharine, cellulose, magnesium carbonate and the like. These 
compositions may take the form of solutions, suspensions, tablets, pills, capsules, sustained 
release formulations or powders and contain about 10% to about 95% of active ingredient, 
preferably about 25% to about 70% (w/w). 

The polypeptide chains used in the vaccine may be formulated into the vaccine as 
neutral or salt forms. Pharmaceutically acceptable salts include, for example, acid addition 
salts formed by the addition of inorganic acids such as hydrochloric or phosphoric acids, or 
organic acids such as acetic, oxalic, tartaric, maleic, or other acids known to those skilled 
in the art. Salts formed with the free carboxyl groups also may be derived from inorganic bases 
such as sodium, potassium, ammonium, calcium or ferric hydroxides and the like, and organic 
bases such as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine, procaine, or 
other bases known to those skilled in the art. 

Vaccines typically are administered in a way compatible with the dosage formulation, 
and in such amounts that will be effective prophylactically and/or therapeutically. The quantity 
to be administered generally ranges from about 5 µg to about 250 µg of antigen per dose; 
however, the actual dose will depend upon the health and size of the subject, the capacity of the 
subject's immune system to synthesize antibodies, and the degree of protection sought. The 
vaccine may be given in a single or multiple dose schedule. A multiple dose is one in which a 
primary course of vaccination may be with one to ten separate doses, followed by other doses 
given at subsequent time intervals required to maintain and/or to reinforce the immune 
response, for example, at one to four months for a second dose, and if required by the 
individual, a subsequent dose(s) several months later. In addition, the dosage regimen may be 
determined, at least in part, by the need of the individual, and may be dependent upon the 
practitioner's judgment. 

With regard to dead or otherwise inactivated US-type or US-subtype hepatitis E virus 
containing vaccines, inactivation may be facilitated using conventional methodologies well 
known and thoroughly documented in the art. Preferred inactivation methods include, for 
example, exposure to one or more of (i) organic solvents, (ii) detergents, (iii) formalin, and (iv) 
ionizing radiation. It is contemplated that some of the proteins in attenuated vaccines may 
cross-react with other known viruses, and thus shared epitopes may exist between a US-type or 
US-subtype hepatitis E virus and other members of the HEV family (for example, members of 
the Burmese or Mexican families) and thus give rise to protective antibodies against one or 
more of the disorders caused by these pathogenic agents. Preferred formulations and modes of 
administration are thoroughly documented in the art and so are not discussed in detail herein. 
The various factors to be considered may include one or more features discussed hereinabove 
for the peptide based vaccines. 

With regard to the live, but attenuated vaccines, it may be possible to produce 
attenuated virus using any of the attenuation methods known and used in the art. Briefly, 
attenuation may be accomplished by passage of the virus at low temperatures or by introducing 
missense mutations or deletions into the viral genome. Preferred formulations and modes of 
administration are thoroughly documented in the art and so are not discussed in detail herein. 
The various factors to be considered may include one or more features discussed hereinabove 
for the peptide based vaccines. 

With regard to live, recombinant vaccines (vector vaccines), these may be developed by 
incorporating into the genome of a living but harmless virus or bacterium, a gene or nucleic acid 
sequence encoding a US-type or US-subtype hepatitis E specific polypeptide chain defining an 
antigenic determinant. The resulting vector organism may then be administered to the intended 
host. Typically, for such a vaccine to be successful, the vector organism must be viable, and 
either naturally non-virulent or have an attenuated phenotype. Preferred host organisms include 
vaccinia virus, adenovirus, adeno-associated virus, salmonella and mycobacteria. Live strains of 
vaccinia virus and mycobacteria have been administered safely to humans in the forms of the 
smallpox and tuberculosis (BCG) vaccines, respectively. In addition, they have been shown to 
express foreign proteins and exhibit little or no conversion into virulent phenotypes. Vector 
vaccines are capable of carrying a plurality of foreign genes or nucleic acid sequences thereby 
permitting simultaneous vaccination against a variety of preselected antigenic determinants. 
Preferred formulations and modes of administration are thoroughly documented in the art and so 
are not discussed in detail herein. 

IV. Identification of molecules with anti-US-type or anti-US-subtype hepatitis E 
virus activity. 

In view of the discovery of specific HEV US-type sequences, it is contemplated that one 
skilled in the art may be able to identify molecules which either inactivate or reduce the activity 
of HEV US-type specific proteins, e.g., the helicase, methyltransferase, or protease proteins 
encoded by the ORF 1 portions of the HEV genome. An exemplary protocol for identifying 
molecules that inhibit the HCV protease is described in U.S. Patent No. 5,597,691, the 
disclosure of which is incorporated herein by reference. Although the method pertains to the 
identification of HCV protease inhibitors, it is contemplated that the same or similar protocols 
may be used to identify HEV protease inhibitors, or inhibitors of any other protein encoded by a 
HEV US-type sequence. 

Briefly, a method for identifying HEV protease inhibitors is as follows. Typically, a 
substrate is employed which mimics the protease's natural substrate, but which provides a 
quantifiable signal when cleaved. The signal preferably is detectable by colorimetric or 
fluorometric means; however, other methods such as HPLC or silica gel chromatography, 
nuclear magnetic resonance, and the like may also be useful. After optimum substrate and 
protease concentrations have been determined, candidate protease inhibitors are added one at a 
time to the reaction mixture at a range of concentrations. The assay conditions preferably 
resemble the conditions under which the protease is to be inhibited in vivo, i.e., under 
physiologic pH, temperature, ionic strength, etc. Suitable inhibitors exhibit strong protease 
inhibition at concentrations which do not raise toxic side effects in the subject. Inhibitors 
which compete for binding to the protease active site may require concentrations equal to or 
greater than the substrate concentration, while inhibitors capable of binding irreversibly to the 
protease active site may be added in concentrations on the order of the enzyme concentration. 
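
Why a competitive inhibitor may need to be present at or above the substrate concentration can be sketched with the textbook Michaelis-Menten rate law for competitive inhibition, v = Vmax[S] / (Km(1 + [I]/Ki) + [S]). This kinetic model is an assumption introduced for illustration and is not stated in this specification. All parameter values below are hypothetical and in arbitrary consistent units.

```python
def competitive_rate(substrate, inhibitor, vmax, km, ki):
    """Reaction rate under competitive inhibition (Michaelis-Menten form)."""
    return vmax * substrate / (km * (1.0 + inhibitor / ki) + substrate)


def fractional_inhibition(substrate, inhibitor, km, ki):
    """Fraction of uninhibited activity lost at a given inhibitor concentration."""
    uninhibited = competitive_rate(substrate, 0.0, 1.0, km, ki)
    inhibited = competitive_rate(substrate, inhibitor, 1.0, km, ki)
    return 1.0 - inhibited / uninhibited


if __name__ == "__main__":
    # Hypothetical screen: one candidate tested over a range of concentrations.
    substrate, km, ki = 1.0, 1.0, 1.0
    for inhibitor in (0.1, 1.0, 10.0):
        print(inhibitor, round(fractional_inhibition(substrate, inhibitor, km, ki), 3))
```

Under this model, raising the substrate concentration lowers the inhibition obtained at a fixed inhibitor concentration, because substrate and inhibitor compete for the same site.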

It is contemplated that the inhibitors may be organic compounds, which, for example, 
mimic the cleavage site recognized by the HEV protease, or alternatively, may be proteins, for 
example, antibodies or antibody fragments capable of binding specifically to and inactivating or 
reducing the activity of the HEV protease. Once identified, the protease inhibitors may be 
administered by a variety of methods, such as intravenously, orally, intramuscularly, 
intraperitoneally, bronchially, intranasally, and so forth. The preferred route of administration 
will depend upon the nature of the inhibitor. Inhibitors prepared as organic compounds may be 
administered orally (which is generally preferred) if well absorbed. Protein-based inhibitors 
(such as most antibodies or antibody derivatives) generally are administered by parenteral 
routes. 

Examples 

Practice of the invention will be more fully understood from the following examples, 
which are presented herein for illustrative purposes only, and should not be construed as 
limiting the invention in any way. All citations to the literature, both supra and infra, including 
patents, patent applications and scientific publications are incorporated by reference herein, in 
their entirety. 

Example 1 - Case study 

HEV strain US-1 was identified in the serum of a patient (USP-1) suffering from acute 
hepatitis. The patient was a 62-year-old white male who was hospitalized in Rochester, MN 
after a three-week history of fever, abdominal pain, jaundice, and pruritus. Onset of signs and 
symptoms began two weeks after returning home following a ten-day trip to San Jose, 
California. 

His past medical history included a nephrectomy for autosomal dominant polycystic 
kidney disease accompanied by mild renal insufficiency, and a laparoscopic cholecystectomy 
for symptomatic cholelithiasis. The patient had osteoarthritis and was hypertensive. 
Lisinopril therapy had been initiated three months prior to admission. Physical examination 
revealed an ill-appearing icteric white male with an enlarged tender liver, and no asterixis. 
Serum aspartate aminotransferase (AST), alanine aminotransferase (ALT), and bilirubin levels 
were markedly elevated at the time of hospital admission and peaked 8 days and 16 days after 
hospitalization, respectively (Figure 2). Lisinopril was discontinued on admission. Serologies 
for hepatitis A (IgM and IgG anti-HAV), hepatitis B (HBsAg, IgM and IgG anti-HBc), hepatitis 
C (anti-HCV), and HCV RNA were negative. Ceruloplasmin, iron, transferrin, anti-nuclear and 
anti-smooth muscle antibodies, toxin and drug screen were all normal. Careful questioning of 
the patient revealed no history of ethanol use. Abdominal ultrasound and computed tomography 
scan, and endoscopic retrograde cholangiopancreatogram were also normal. A liver biopsy 
showed a severe, acute lobular hepatitis with striking pyknotic and ballooning degeneration of 
hepatocytes consistent with autoimmune, drug, or viral hepatitis. 

The patient made a complete clinical recovery within 2 months, with normalization of 
AST, ALT, and bilirubin noted about 5 months after hospital admission. No risk factors for 
acquiring HEV were identified. He had not traveled outside the US for over 10 years. In the 6 
weeks prior to illness onset, the only meals he reported eating that were not prepared at home 
were at a Mexican restaurant and a large fast food restaurant chain. He had no exposure to 
untreated drinking water, did not report eating raw shellfish, and had no known exposure to 
farm animals. None of the food handlers at the Mexican restaurant or the fast food restaurant 
reported foreign travel within the 5 months preceding the admission date and none reported signs 
and/or symptoms of hepatitis. No other cases of non-ABC hepatitis were reported to the 
county health departments where the patient stayed in California and where the patient lived in 
Minnesota during the period of admission. No family members had signs and/or symptoms of 
hepatitis either during the patient's trip to California or in the subsequent 10 weeks. Serum 
obtained from 6 family members in California, and from his spouse who lived with him in 
Minnesota over the period of interest, was negative for anti-HEV by EIA. 

Example 2 - Identification of unique isolate of HEV US-1 

The presence of HEV was determined by RT-PCR using HEV primer sequences. 
Briefly, nucleic acids were isolated from 25 µL of serum from patient USP-1 as previously 
described (Schlauder et al. (1995) J. Virological Methods 46: 81-89). Ethanol precipitated 
nucleic acids were resuspended in 3 µL of diethyl pyrocarbonate (DEPC) treated water. 

cDNA synthesis and PCR were performed using the GeneAmp RNA PCR kit from 
Perkin-Elmer (Norwalk, CT) in accordance with the manufacturer's instructions. RNA (1 µL) 
was used as a template for each 10 µL cDNA reaction. cDNA synthesis was primed with 
specific primers added to a final concentration of 4 µM. The subsequent amplification of cDNA 
was primed with oligonucleotides added to a final concentration of 0.8 to 1.0 µM. PCR was 
performed for 40 cycles (94°C, 20 sec; 55°C, 30 sec; 72°C, 30 sec; followed by an extension 
cycle of 72°C for 3 min). The initial PCR reaction (2 µL) then was used as a template for a 
second round of amplification using a nested set of PCR primers. PCR was performed using 
the GeneAmp PCR kit from Perkin-Elmer in accordance with the manufacturer's instructions. 
Briefly, primers were added to a final concentration of 1 µM. The initial set of experiments 
used three sets of primers: two sets from the 5'-end of ORF 1, based on sequences from the 
Burmese and Mexican strains, and one set from the 3'-end of ORF 1, based on the Mexican strain 
sequence. The three sets of primers used were as follows: 

Primer Set 1 

Primer                              Sequence                 SEQ ID NO: 

5'-ORF 1-Mexican primer C375M       CTGAACATCCCGGCCGAC       SEQ ID NO:1 
PCR primer A1-350M                  AGAAAGCAGCGATGGAGGA      SEQ ID NO:2 
PCR primer S1-34M                   GCCCACCAGTTCATTAAGGCT    SEQ ID NO:3 
nested PCR primer A2-320M           TCATTAATGGAGCGTGGGTG     SEQ ID NO:4 
nested PCR primer S2-55M            CCTGGCATCACTACTGCTAT     SEQ ID NO:5 



Primer Set 2 

Primer                              Sequence                 SEQ ID NO: 

5'-ORF 1-Burmese cDNA primer C375   CTGAACATCACGCCCAAC       SEQ ID NO:6 
PCR primer A1-350                   AGGAAGCAGCGGTGGACCA      SEQ ID NO:7 
PCR primer S1-34                    GCCCATCAGTTTATTAAGGC     SEQ ID NO:8 
nested PCR primer A2-320            TCATTTATTGAGCGGGGATG     SEQ ID NO:9 
nested PCR primer S2-55             CCTGGCATCACTACTGCTAT     SEQ ID NO:10 



Primer Set 3 

Primer                              Sequence                     SEQ ID NO: 

3'-ORF 1-Mexican cDNA primer M1PR6  CCATGTTCCACACCGTATTCCAGAG    SEQ ID NO:11 
PCR primer S4294M                   GTGTTCTACGGGGATGCTTATGACG    SEQ ID NO:12 
nested PCR primer M1PF6             GACTCAGTATTCTCTGCTGCCGTGG    SEQ ID NO:13 
nested PCR primer A4556             GGCTCACCAGAATGCTTCTTCCAGA    SEQ ID NO:14 

The 5'-ORF 1-Burmese primers are described in Schlauder et al. (1993) Lancet 341: 
378. Primers M1PR6 and M1PF6 are described in McCaustland et al. (1991) J. Virological 
Methods 35: 331-342. The PCR products were separated by agarose gel electrophoresis and 
visualized by UV irradiation after ethidium bromide staining. The resulting PCR products were 
hybridized to a radiolabeled probe after Southern blot transfer to a nitrocellulose filter. 
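
The first-round cycling conditions given in this example (40 cycles of 94°C for 20 sec, 55°C for 30 sec and 72°C for 30 sec, followed by 72°C for 3 min) can be written out as data, which makes the programmed hold time easy to check. The sketch below is illustrative only; the dictionary layout and function name are assumptions, and the total excludes ramp times between temperatures, which depend on the instrument.

```python
# First-round PCR profile as stated in the example: (temperature in deg C, seconds).
PRIMARY_PCR = {
    "cycles": 40,
    "steps": [(94, 20), (55, 30), (72, 30)],
    "final_extension": (72, 180),
}


def total_hold_seconds(program):
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    per_cycle = sum(seconds for _, seconds in program["steps"])
    return program["cycles"] * per_cycle + program["final_extension"][1]


if __name__ == "__main__":
    seconds = total_hold_seconds(PRIMARY_PCR)
    print(seconds, "seconds, or about", round(seconds / 60, 1), "minutes")
```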

Radiolabeled probes were generated from PCR products purified with the QIAEX gel 
extraction purification kit by Qiagen (Chatsworth, CA). Radiolabel was incorporated using the 
Stratagene® (La Jolla, CA) Prime-It II kit according to the manufacturer's instructions. Filters 
were prehybridized in Rapid-hyb buffer from Amersham (Arlington Heights, IL) for 3-5 hours, 
and then hybridized in Fast-Pair Hybridization Solution with 100-200 cpm/cm2 at 42°C for 15- 
25 hours. Filters then were washed as described in Schlauder et al. (1992) J. Virol. Methods 37: 
189-200. Phosphorimages of the probed filters were obtained with a Molecular Dynamics 
Phosphorimager 425E (Sunnyvale, CA). 

Ethidium bromide stained bands were detected with the primers from the 5'-end of ORF 
1. However, only the primers based on the Mexican strain resulted in a nested product of the 
expected size of 266 base pairs. Hybridization to a probe derived from a Burmese-like strain 
(identity > 90%) infected patient resulted in a very weak hybridization signal to the patient 
USP-1 derived products relative to the signal from the Burmese positive control. These results 
gave the first indication that this isolate was not closely related to the Burmese isolate. No 
probe was available from the Mexican strain. 

To confirm these results, RNA was extracted from additional serum aliquots of patient 
USP-1. RT-PCR was performed using the 5'-ORF 1-Mexican primers, SEQ ID NOS:1-5, as 
described above. Following agarose gel electrophoresis and staining with ethidium bromide, a 
342 bp product was visualized in each sample. The PCR products were extracted from the 
agarose gel using the QIAEX II Agarose Gel Extraction Kit by Qiagen (Chatsworth, CA) and 
cloned into pT7 Blue T-vector plasmid by Novagen (Madison, WI). The cloned products were 
sequenced using the SEQUENASE VERSION 2.0 sequencing kit (USB, Cleveland, OH) in 
accordance with the manufacturer's instructions. 

The nucleotide sequences obtained from the product of the latter two samples were 
identical and are shown in SEQ ID NO: 15. These results indicate that only the cDNA primer 
and primer S1 from both the Burmese and Mexican strains resulted in an ethidium bromide 
stainable product from the patient USP-1 samples. Only the Mexican strain based nested 
primers, S2 and A2, generated an ethidium bromide stainable product of the expected size. 

In order to determine the degree of relatedness between the HEV US-1 isolate and other 
known isolates of HEV, alignments of the nucleotide and amino acid sequences were 
performed using the program GAP of the Wisconsin Sequence Analysis Package (Version 9), 
available from the Genetics Computer Group, Inc., 575 Science Drive, Madison, Wisconsin, 
53711. The program employs the algorithm of Needleman and Wunsch (J. Mol. Biol. (1970) 
48:443-453) to calculate the degree of similarity and identity, which are expressed as 
percentages between the two sequences being aligned. The gap creation and gap extension 
penalties were 50 and 3.0, respectively, for nucleic acid sequence alignments, and 12 and 4, 
respectively, for amino acid sequence comparisons. 
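
By way of illustration only, the role of the gap creation and gap extension penalties in a 
global alignment can be sketched as follows. This is not the GAP program itself: it is a 
minimal Needleman-Wunsch-style score calculation with affine gaps (the Gotoh formulation), 
and it assumes that a gap of length n costs the creation penalty plus n times the extension 
penalty. The function name and the match and mismatch scores (10 and -9) are placeholders 
chosen for the example, not values taken from the GCG scoring matrices. 

```python
def global_align_score(a, b, match=10, mismatch=-9, gap_open=50, gap_extend=3):
    """Best global alignment score of a and b under affine gap penalties.

    Assumption for this sketch: a gap of length n costs gap_open + n * gap_extend.
    """
    neg = float("-inf")
    n, m = len(a), len(b)
    # M: a[i-1] paired with b[j-1]; X: a[i-1] opposite a gap; Y: b[j-1] opposite a gap
    M = [[neg] * (m + 1) for _ in range(n + 1)]
    X = [[neg] * (m + 1) for _ in range(n + 1)]
    Y = [[neg] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + gap_extend * i)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + gap_extend * j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Opening a new gap pays creation plus one extension; continuing pays extension only.
            X[i][j] = max(M[i - 1][j] - gap_open - gap_extend,
                          Y[i - 1][j] - gap_open - gap_extend,
                          X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open - gap_extend,
                          X[i][j - 1] - gap_open - gap_extend,
                          Y[i][j - 1] - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(global_align_score("ACGT", "ACGT"))  # four matches, no gaps
    print(global_align_score("ACGT", "AGT"))   # three matches and one single-base gap
```

With a creation penalty that is large relative to the match score, as here, the alignment 
opens a gap only where the sequences cannot otherwise be paired. 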

The complete nucleotide and amino acid sequences of the two 'prototype' HEV isolates 
from Burma and Mexico, as well as other sequences used for analyses, were obtained from 
GenBank; their respective accession numbers are indicated in Table 1 below. Each of 
these sequences is incorporated herein by reference. 

TABLE 1 

Isolate          GenBank Accession Number 
Mexican (M1)     M74506 
Burmese (B1)     M73218 
Pakistan (P1)    M80581 
Chinese (C4)     D11093 

A 303 base pair sequence of HEV US-1 (homologous to residues 1-303 of SEQ ID 
NO:89) was compared against the homologous regions identified in the Mexican, Burmese, 
Pakistani, and Chinese strains. The resulting percent identities are summarized in Table 2 
below. 

TABLE 2. Identity over 303 nucleotides from the 5'-end ORF 1 product 

            US-1    Mexican   Burmese   Pakistan 
Mexican     77.2 
Burmese     74.9    83.2 
Pakistan    75.9    83.2      95.7 
Chinese     75.9    83.5      95.7      97.4 

The results in Table 2 indicate that the fragment from the 5'-end of ORF 1 from the 
USP-1 isolate showed a nucleic acid identity from about 74.9 to about 77.2 % relative to other 
known isolates of HEV. This was less than the identity between the prototype Mexican and 
Burmese isolates (83.2%). These results indicate that the product likely was derived from a 
unique isolate of HEV not previously identified. 

Example 3 - Genome Extension and Sequencing of HEV US-1 

The clone obtained and sequenced as described in Example 2 (SEQ ID NO:15) 
hereinabove was derived from a unique HEV genome, HEV US-1. To obtain sequences from 
additional regions of the HEV US-1 genome, several reverse transcriptase-polymerase chain 
reaction (RT-PCR) walking experiments were performed. 

Total nucleic acids were extracted by the procedure described in Example 2 (for SEQ ID 
NO:19 only) or by one of the following procedures. Aliquots (25 µL) of patient USP-1 serum 
were extracted using the Total Nucleic Acid Extraction procedure in accordance with the 
manufacturer's instructions (United States Biochemical) in the presence of 10 mg yeast tRNA as 
carrier. Nucleic acids were precipitated and resuspended in 3.75 µL RNase/DNase free water. 
Alternatively, total RNA was isolated from 100 µL of serum using the ToTALLY RNA 
isolation kit as recommended by the manufacturer (Ambion, Inc.). The resulting RNAs were 
treated with DNase and column purified with reagents from the S.N.A.P. Total RNA isolation kit 
(Invitrogen, San Diego, CA). Thereafter, RNA was precipitated with 0.1 volumes of 3M 
sodium acetate, 2 µL Pellet Paint (Novagen) as carrier, and 2 volumes ethanol. RNA pellets 
were dissolved in 50 µL DEPC treated water. 

RT-PCR was performed using the GeneAmp RNA PCR kit in accordance with the 
manufacturer's instructions (Perkin-Elmer). Random hexamers were used to prime cDNA 
synthesis in a total volume of 25 µL, except for the isolation of SEQ ID NO:19, which utilized 
cDNA specifically primed with primer PA2-5560 (SEQ ID NO:16), as described in Example 2 
above. US1-gap was generated with specifically primed cDNA generated using RNA extracted 
from 12.5 µL serum equivalents, primer US1 gap-a0.5 (SEQ ID NO:46), and Superscript II (3' 
RACE Kit; GIBCO BRL). PCR was performed with the cDNA making up one-fifth of the 
total reaction volume (2 µL for a 10 µL reaction, 5 µL for a 25 µL reaction, etc.). Standard PCR 
was performed in the presence of 2 mM MgCl2 and 0.5 to 1.0 µM of each primer. Modified 
reactions contained 1x PCR Buffer and 20% Q Solution (Qiagen) in accordance with the 
manufacturer's instructions for the isolation of SEQ ID NOS:33 and 41. Reactions used two 
HEV consensus primers (Table 3), one HEV consensus primer and one HEV US-1 specific 
primer (Table 4), two HEV US-1 specific primers (Table 5), one HEV US-1 specific primer and 
one HEV US-2 (see Example 5) specific primer (Table 6), or two HEV US-2 specific primers 
(Table 7). Reactions were subjected to thermal cycling as follows: 

SEQ ID NOS:19, 24, 27, 30, 33, 41, 44, 60, 64, 68, 73, 78, and 83 were obtained by 
touchdown PCR. Amplification involved 43 cycles of 94°C for 30 seconds, 55°C for 30 
seconds (-0.3°C/cycle), and 72°C for 1 minute. This was followed by 10 cycles of 94°C for 30 
seconds, 40°C for 30 seconds, and 72°C for 1 minute. For SEQ ID NOS:38, 49, 52, and 55, 
cycling involved 35 rounds of 94°C for 30 seconds, 55°C for 30 seconds, and 72°C for 1 
minute. All amplifications were preceded by 1-2 minutes at 94°C and followed by 72°C for 5 
to 10 minutes. The reactions were held at 4°C prior to agarose gel analysis. 
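
For illustration, the annealing temperatures implied by the touchdown program above can be 
tabulated as in the following sketch. It assumes that the first touchdown cycle anneals at 55°C 
and that the 0.3°C decrement is applied once per subsequent cycle; the function and parameter 
names are hypothetical and do not correspond to any thermal cycler software. 

```python
def touchdown_annealing_temps(start_c=55.0, step_c=-0.3, touchdown_cycles=43,
                              final_c=40.0, final_cycles=10):
    """Return the annealing temperature (deg C) used in each cycle, in order."""
    # Touchdown phase: the annealing temperature drops by a fixed step each cycle.
    temps = [round(start_c + step_c * cycle, 1) for cycle in range(touchdown_cycles)]
    # Fixed phase: remaining cycles at a constant, lower annealing temperature.
    temps += [final_c] * final_cycles
    return temps


if __name__ == "__main__":
    temps = touchdown_annealing_temps()
    print(len(temps), "cycles in total")
    print("first touchdown cycle:", temps[0])
    print("last touchdown cycle:", temps[42])
    print("fixed-phase cycles:", temps[43])
```

Under these assumptions the touchdown phase ends a little above the 40°C used in the final ten 
cycles, so the program moves from stringent to permissive annealing without a large jump. 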

The isolation of SEQ ID NO:19 required a second round of touchdown amplification to 
isolate the desired product. Here, 1 µL of first round was placed into a second round 25 µL 
reaction. The second round amplification utilized hemi-nested primers as indicated in Table 3 
by reactions 1.1.1 and 1.1.2. The isolation of SEQ ID NO:24 required a second round of nested 
touchdown amplification as described above and indicated in Table 4 as reactions 2.1.1 and 
2.1.2. The isolation of SEQ ID NOS:38 and 49 required a second round of nested PCR (Table 
5) utilizing 1 µL of first round into a 25 µL reaction as described above. The isolation of SEQ 
ID NOS:60, 64, 68, and 73 required nested PCR in which 1 µL of the first round was amplified 
in a 25 µL second round reaction (Table 6). Products SEQ ID NOS:78 and 83 were generated 
from two rounds of amplification (Table 7). 

Agarose gel electrophoresis was performed on a fraction or all of the PCR reaction in a 
0.8% to 2% agarose TAE gel in the presence of 0.2 mg/mL ethidium bromide. Products were 
visualized by UV irradiation and products of the desired molecular weight were excised, 
purified using GeneClean in accordance with the manufacturers' instructions (BIO 101, Inc.), 
and cloned into pT7-Blue T-Vector plasmid (Novagen) II or pGEM-T Easy Vector (Promega) 
in accordance with the manufacturers' instructions. Cloned products were sequenced as 
described in Example 2 or on an ABI Model 373 DNA Sequencer using ABI Sequencing Ready 
Reaction Kit as specified by the manufacturer. Results of these experiments are presented 
hereinbelow in Tables 3, 4, 5, 6, and 7. 



TABLE 3 

Reaction   Primer 1        Primer 2        Approx. Prod. Size/SEQ ID 
1.1.1      SEQ ID NO:17    SEQ ID NO:16    none 
1.1.2      SEQ ID NO:18    SEQ ID NO:16    251 bp/SEQ ID NO:19 
1.2        SEQ ID NO:28    SEQ ID NO:29    168 bp/SEQ ID NO:30 



TABLE 4 

Reaction   Primer 1        Primer 2        Approx. Product Size/SEQ ID NO 
2.1.1      SEQ ID NO:20    SEQ ID NO:22    none 
2.1.2      SEQ ID NO:21    SEQ ID NO:23    899 bp/SEQ ID NO:24 
2.2        SEQ ID NO:25    SEQ ID NO:26    846 bp/SEQ ID NO:27 
2.3        SEQ ID NO:31    SEQ ID NO:32    424 bp/SEQ ID NO:33 
2.4        SEQ ID NO:39    SEQ ID NO:40    460 bp/SEQ ID NO:41 
2.5        SEQ ID NO:42    SEQ ID NO:43    235 bp/SEQ ID NO:44 

TABLE 5 

Reaction   Primer Set PCR 1             Primer Set PCR 2             Approx. Product Size/SEQ ID NO: 
3.1        SEQ ID NO:34/SEQ ID NO:35    SEQ ID NO:36/SEQ ID NO:37    1186 bp/SEQ ID NO:38 
3.2        SEQ ID NO:45/SEQ ID NO:46    SEQ ID NO:47/SEQ ID NO:48    545 bp/SEQ ID NO:49 
3.3        SEQ ID NO:50/SEQ ID NO:51    -                            344 bp/SEQ ID NO:52 
3.4        SEQ ID NO:53/SEQ ID NO:54    -                            194 bp/SEQ ID NO:55 


TABLE 6 

Reaction   Primer Set PCR 1             Primer Set PCR 2             Approx. Product Size/SEQ ID NO: 
4.1        SEQ ID NO:56/SEQ ID NO:57    SEQ ID NO:58/SEQ ID NO:59    464 bp/SEQ ID NO:60 
4.2        SEQ ID NO:61/SEQ ID NO:62    SEQ ID NO:63/SEQ ID NO:62    433 bp/SEQ ID NO:64 
4.3        SEQ ID NO:65/SEQ ID NO:66    SEQ ID NO:65/SEQ ID NO:67    382 bp/SEQ ID NO:68 
4.4        SEQ ID NO:69/SEQ ID NO:70    SEQ ID NO:71/SEQ ID NO:72    451 bp/SEQ ID NO:73 


TABLE 7 

Reaction   Primer Set PCR 1             Primer Set PCR 2             Approx. Product Size/SEQ ID NO: 
5.1        SEQ ID NO:74/SEQ ID NO:75    SEQ ID NO:76/SEQ ID NO:77    334 bp/SEQ ID NO:78 
5.2        SEQ ID NO:79/SEQ ID NO:80    SEQ ID NO:81/SEQ ID NO:82    413 bp/SEQ ID NO:83 

To obtain the sequence at the 3' end of the genome, amplification utilized the 3' RACE 
System of GIBCO BRL in accordance with the manufacturer's instructions. It was assumed 
that, as an HEV strain, the 3' end of the HEV US-1 genome would contain a poly-adenosine tail 
similar to the Mexican, Burmese, and Pakistani strains. RNA extracted as described above 
from the equivalent of 50 µL of serum was reverse transcribed utilizing the oligo dT adapter 
primer 5'-GGCCACGCGTCGACTAGTACTTTTTTTTTTTTTTTTT-3' (SEQ ID NO:84) 
supplied by the manufacturer. First round PCR utilized the AUAP primer supplied, 5'- 
GGCCACGCGTCGACTAGTAC-3' (SEQ ID NO:85), and an HEV US-1 specific primer (Table 
8) at 0.2 mM final concentration with PCR Buffer, MgCl2, and cDNA concentrations as 
recommended. Amplification involved 35 cycles of 94°C for 30 seconds, 55°C for 30 seconds, 
and 72°C for 1 minute. Amplification was preceded by a 1 minute incubation at 94°C and 
followed by a 72°C, 10 minute extension. A second round of amplification used 1 µL of first 
round in a 50 µL reaction. PCR buffer was 1X final concentration with 2 mM MgCl2, and 0.5 
mM of each of the primers. Primers were hemi-nested with the AUAP primer and an HEV US-1 
specific primer (Table 8). Amplification conditions were the same as first round. The products 
were analyzed by agarose gel electrophoresis, cloned, and sequenced as above. 
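
The relationship between the two manufacturer-supplied primers may be illustrated by the 
following sketch, which checks that the AUAP primer (SEQ ID NO:85) corresponds to the 
adapter portion of the oligo dT adapter primer (SEQ ID NO:84), so that any cDNA primed from 
the poly-adenosine tail carries a binding site for AUAP. The two sequences are copied from 
the text above; the function name is hypothetical. 

```python
# Primer sequences as given in the text (5' to 3').
ADAPTER_PRIMER = "GGCCACGCGTCGACTAGTACTTTTTTTTTTTTTTTTT"  # SEQ ID NO:84
AUAP_PRIMER = "GGCCACGCGTCGACTAGTAC"                      # SEQ ID NO:85


def split_adapter(primer):
    """Split an oligo dT adapter primer into its adapter part and poly-T tail length."""
    adapter = primer.rstrip("T")          # drop the 3' run of T residues
    tail_length = len(primer) - len(adapter)
    return adapter, tail_length


if __name__ == "__main__":
    adapter, tail_length = split_adapter(ADAPTER_PRIMER)
    print("adapter portion:", adapter)
    print("poly-T tail length:", tail_length)
    print("AUAP matches adapter portion:", adapter == AUAP_PRIMER)
```

Because AUAP anneals only to the adapter, specificity in each round comes from the HEV US-1 
specific primer, which is why the second round is hemi-nested rather than fully nested. 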

TABLE 8 

Reaction   Primer Set PCR 1             Primer Set PCR 2             Approx. Product Size/SEQ ID NO: 
8.1        SEQ ID NO:86/SEQ ID NO:85    SEQ ID NO:87/SEQ ID NO:85    960 bp/SEQ ID NO:88 



The sequences obtained from the products described in Tables 3, 4, 5, 6, 7, and 8 
hereinabove, and the initial PCR product near the 5' end of the genome, SEQ ID NO:15, were 
assembled into contigs using the programs of the GCG package (Genetics Computer Group, 
Madison, WI, version 9) and a consensus sequence determined. A schematic of the assembled 
contig is presented in Figure 3. The HEV US-1 genome is 7202 bp in length, all of which has 
been sequenced (SEQ ID NO:89). This sequence was translated into three open reading frames, 
two of which are shown in SEQ ID NO:90 (the third ORF is positioned at nucleotide positions 
5094-5462 but cannot be shown in SEQ ID NO:90 due to overlap with the other two ORFs). 
The resulting translations (ORF 1, ORF 2, and ORF 3) are set forth in SEQ ID NO:91, SEQ ID 
NO:92, and SEQ ID NO:93, respectively. 
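
As a simple check on the coordinates given above, the length and reading frame of an open 
reading frame can be derived from its first and last nucleotide positions. The sketch below 
assumes 1-based, inclusive coordinates and does not assume whether the stop codon is included 
in the stated range; the function name is hypothetical. 

```python
def orf_summary(start, end):
    """Summarize an ORF given 1-based, inclusive nucleotide coordinates."""
    length = end - start + 1
    return {
        "length_nt": length,
        "whole_codons": length // 3,
        "multiple_of_three": length % 3 == 0,
        # Frame offset (0, 1 or 2) relative to the first nucleotide of the genome.
        "frame_offset": (start - 1) % 3,
    }


if __name__ == "__main__":
    # ORF 3 of HEV US-1 as stated in the text: nucleotides 5094-5462.
    print(orf_summary(5094, 5462))
```

A length that is a whole number of codons is consistent with the stated range delimiting a 
complete reading frame. 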

Example 4 - Identification of unique isolate of HEV US-2 

A patient from the US suffering from acute hepatitis, who tested for IgG class antibodies in 
the HEV EIA test, also tested positive by means of a US-1 strain-specific ELISA. This patient 
(USP-2), diagnosed with acute hepatitis, was a 62-year-old male who was admitted to the 
hospital with jaundice and fatigue. Initial laboratory studies indicated an ALT of 1270 U/L 
(normal 0-40 U/L). Since there was a recent outbreak of hepatitis A virus (HAV) in the area, it 
was suspected that this individual was infected with HAV. However, the anti-HAV IgM test, 
HAVAB-M EIA (Abbott Laboratories), was negative, as were tests for serologic markers for 
hepatitis B virus and hepatitis C virus. This patient's history included a visit to Cancun, 
Mexico, several weeks prior to the onset of his illness. 

The sample from the patient then was analyzed for the presence of HEV specific sequences 
via PCR amplification using HEV US-1 specific PCR primers. RNA was extracted using 
Ultraspec as described in Example 2. Random primed cDNA synthesis was performed as 
described in Example 3 and PCR was performed using standard conditions as described in 
Example 2 with HEV US-1 specific primers SEQ ID NO:94 and SEQ ID NO:96. Nested PCR 
was performed with primers SEQ ID NO:95 and SEQ ID NO:97. Sequencing of the PCR 
product was performed as described in Example 3. The sequence of the resulting PCR product 
is set forth in SEQ ID NO:98. GAP analysis as described in Example 2 showed that the 
nucleotide sequence, SEQ ID NO:98, was 95% identical to the corresponding or homologous 
region from HEV US-1. 

Example 5 - Genome Extension and Sequencing of HEV US-2 

The clone obtained and sequenced in Example 4 (SEQ ID NO:98) was derived from an 
HEV isolate most closely related to HEV US-1. To obtain additional regions of the HEV US-2 
genome, several RT-PCR walking experiments were performed as described in Example 3. 
RNA was extracted using the Total Nucleic Acid Extraction procedure (United States 
Biochemical). Reverse transcription was random primed using the GeneAmp RNA PCR kit 
(Perkin-Elmer). Standard PCR was performed in the presence of 2 mM MgCl2 and 0.5 to 1.0 
µM of each primer. Modified reactions contained 1x PCR Buffer and 20% Q Solution 
(Qiagen) for the isolation of SEQ ID NOS:129, 141 and 146. Reactions used two HEV US-1 
specific primers (Table 9), one HEV US-1 specific primer and one HEV consensus primer 
(Table 10), one HEV US-2 specific primer and one HEV consensus primer (Table 11), two 
HEV US-2 specific primers (Table 12), or two Burmese, Mexican, and US derived Consensus 
primers (described hereinbelow, Table 13). 

The products shown in SEQ ID NOS:101, 102, 105, 108, 110, 113, 117, 120, 124, 149 
and 151 were obtained by touchdown PCR. Amplification involved 43 cycles of 94°C for 30 
seconds, 55°C for 30 seconds (-0.3°C/cycle), and 72°C for 1 minute. This was followed by 10 
cycles of 94°C for 30 seconds, 40°C for 30 seconds, and 72°C for 1 minute. Cycling involving 
35 cycles of 94°C for 30 seconds, 55°C for 30 seconds, and 72°C for 1 minute was used to 
amplify SEQ ID NOS:129, 132, 136, 141 and 146. All amplifications were preceded by 1-2 
minutes at 94°C and followed by 72°C for 5-10 minutes. The reactions were held at 4°C prior 
to agarose gel analysis. Isolation of many products required a second round of nested or 
hemi-nested PCR as shown in Tables 9-13. In these reactions 1 µL of the PCR1 product was added 
to 25-50 µL of the PCR2 reaction mixture and the resulting mixture cycled as in PCR1. 

Reactions were analyzed and products cloned and sequenced as described in Example 3 
above. The results of these experiments are presented below in Tables 9-13. 

TABLE 9 

Reaction   Primer set PCR1                 Primer set PCR2                Approx. Product Size/SEQ ID NO: 
7.1        SEQ ID NO:99/SEQ ID NO:100      -                              331 bp/SEQ ID NO:101 
7.2        SEQ ID NO:34/SEQ ID NO:35       SEQ ID NO:36/SEQ ID NO:37      1186 bp/SEQ ID NO:102 
7.3        SEQ ID NO:103/SEQ ID NO:104     -                              130 bp/SEQ ID NO:105 
7.4        SEQ ID NO:106/SEQ ID NO:107     SEQ ID NO:39/SEQ ID NO:107     564 bp/SEQ ID NO:108 
7.5        SEQ ID NO:86/SEQ ID NO:109      SEQ ID NO:87/SEQ ID NO:109     678 bp/SEQ ID NO:110 


TABLE 10 

Reaction   Primer set PCR1                 Primer set PCR2                 Approx. Product Size/SEQ ID NO: 
8.1        SEQ ID NO:111/SEQ ID NO:112     -                               580 bp/SEQ ID NO:113 
8.2        SEQ ID NO:114/SEQ ID NO:116     SEQ ID NO:116/SEQ ID NO:115     734 bp/SEQ ID NO:117 


TABLE 11 

Reaction   Primer set PCR1                 Primer set PCR2                 Approx. Product Size/SEQ ID NO: 
9.1        SEQ ID NO:118/SEQ ID NO:119     -                               483 bp/SEQ ID NO:120 
9.2        SEQ ID NO:121/SEQ ID NO:122     SEQ ID NO:121/SEQ ID NO:123     431 bp/SEQ ID NO:124 
9.3        SEQ ID NO:125/SEQ ID NO:126     SEQ ID NO:127/SEQ ID NO:128     1020 bp/SEQ ID NO:129 

TABLE 12 

Reaction   Primer set PCR1                 Primer set PCR2                 Approx. Product Size/SEQ ID NO: 
10.1       SEQ ID NO:130/SEQ ID NO:131     -                               407 bp/SEQ ID NO:132 
10.2       SEQ ID NO:133/SEQ ID NO:134     SEQ ID NO:135/SEQ ID NO:134     547 bp/SEQ ID NO:136 
10.3       SEQ ID NO:137/SEQ ID NO:138     SEQ ID NO:139/SEQ ID NO:140     903 bp/SEQ ID NO:141 
10.4       SEQ ID NO:142/SEQ ID NO:143     SEQ ID NO:144/SEQ ID NO:145     503 bp/SEQ ID NO:146 



TABLE 13 

Reaction   Primer set                      Approx. Product Size/SEQ ID NO: 
11.1       SEQ ID NO:147/SEQ ID NO:148     418 bp/SEQ ID NO:149 
11.2       SEQ ID NO:150/SEQ ID NO:126     197 bp/SEQ ID NO:151 



To obtain the sequence at the 3' end of the genome, amplification utilized the 3' RACE 
System of GIBCO BRL in accordance with the manufacturer's instructions as described in 
Example 3. cDNA was generated using SEQ ID NO:84. PCR1 utilized primers SEQ ID 
NO:150 and SEQ ID NO:85. PCR2 primers were SEQ ID NO:152 and SEQ ID NO:85 
(reaction 12.1). The resulting product was 901 bp (SEQ ID NO:153). 

The isolation of new sequences located at the 5'-terminus of the HEV US-2 viral 
genome was achieved by inverse PCR (M. Zeiner and U. Gehring, Biotechniques 17: 1051- 
1053, 1994). Due to limited availability of sera from USP-1 and USP-2, fecal material from an 
HEV US-2 infected macaque (described in Example 9 below) was chosen as the source 
material. A product of 462 nucleotides was amplified from macaque fecal material from within 
the hypervariable/proline rich hinge region using RNA extracted, reverse transcribed, and PCR 
amplified as described in Example 3 using primers SEQ ID NOS:154, 155, 156 and 157. This 
product (SEQ ID NO:158) was 100% identical to HEV US-2 sequences. Therefore, it is 
contemplated that any sequences identified at the 5' end of the HEV genome from macaque 
feces should accurately represent the 5' end of the HEV US-2 genome. Total nucleic acids were 
extracted from 200 µL of a 10% fecal suspension as described above. Reverse transcription 
reactions, which utilized HEV US specific primers (SEQ ID NO:159), were performed using a 
kit obtained from BMB (as described in M. Zeiner and U. Gehring, Biotechniques, supra), 
except that nucleic acids were denatured at 70°C for 5 min and then placed on ice prior to 
initiation of the RT reaction. Generation of double-stranded, circular cDNAs was performed as 
described in M. Zeiner and U. Gehring, Biotechniques, supra. The resulting circular cDNA 
molecules served as template for subsequent PCR reactions. The primers used in the first PCR 
reaction (PCR1) are shown in SEQ ID NOS:160 and 161. The nested primers used in the 
second PCR reaction (PCR2) were as shown in SEQ ID NOS:162 and 163. 

Products from PCR2 (reaction 13.1) were cloned into pGEM-T Easy Vector (Promega) 
and sequenced using an Applied Biosystems 373 Automated sequencer. One product of 221 
nucleotides was identified as having the appropriate primers and HEV US-2 sequences, 
identifying 63 nucleotides upstream of known HEV US-2 sequences. Additional clones were 
identified with the appropriate primers and portions of this new sequence. Primer extension 
experiments performed on RNA from 100 µL of USP-2 serum or 100 µL of a 10% fecal 
suspension using the sequences shown in SEQ ID NOS:163 and 161 as primers were 
unsuccessful in confirming the length of this sequence. Pair-wise comparisons of the 63 
nucleotides to 5' NTR sequences of Burmese-like isolates revealed identities greater than 94%, 
suggesting that this is the true sequence of HEV US-2. 
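
To give a sense of scale for an identity threshold over so short a sequence, the sketch below 
computes the largest number of mismatched positions compatible with an identity strictly 
greater than a given percentage. It assumes an ungapped comparison over the full stated 
length, which the text does not confirm; the function name is hypothetical. 

```python
def max_mismatches(length, min_identity_pct):
    """Largest mismatch count k such that 100 * (length - k) / length > min_identity_pct."""
    k = 0
    # Keep allowing one more mismatch while identity stays strictly above the threshold.
    while 100.0 * (length - (k + 1)) / length > min_identity_pct:
        k += 1
    return k


if __name__ == "__main__":
    k = max_mismatches(63, 94.0)
    print("maximum mismatches over 63 nt at >94% identity:", k)
    print("identity at that count: %.1f%%" % (100.0 * (63 - k) / 63))
```

Under that assumption, each additional mismatch over a sequence of this length changes the 
identity by well over one percentage point, so only a handful of differences are compatible 
with the stated threshold. 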

The sequences obtained from the products described in this Example and those 
described in Example 4 were assembled into contigs using programs in the GCG package 
(Genetics Computer Group, Madison, WI, version 9) and a consensus sequence determined. A 
schematic of the assembled contigs is presented in Figure 4. The genome of the HEV US-2 
strain is 7277 bp in length, all of which has been sequenced and is set forth in SEQ ID NO:164. 
This sequence was translated into three open reading frames as indicated in SEQ ID NO:165, 
with the translation products of the ORF 1 and ORF 2 sequences only being shown (the third 
ORF is positioned at nucleotide positions 5159-5527 but cannot be shown within SEQ ID 
NO:165 due to overlap with the other two ORFs). The resulting translations of the ORF 1, 
ORF 2, and ORF 3 sequences are shown in SEQ ID NOS:166, 167 and 168, respectively. 

Example 6 - Sequence Comparisons 

Information about the degree of relatedness of viruses typically can be obtained by 
performing comparisons such as alignments of nucleotide and deduced amino acid sequences. 
Alignments of the sequences of the US isolates of HEV (e.g., HEV US-1 and HEV US-2) with 
corresponding sequences of other isolates of HEV provide a quantitative assessment of the 
degree of similarity and identity between the sequences. In general, the calculation of the 
similarity between two amino acid sequences is based upon the degree of likeness exhibited 
between the side chains of an amino acid pair in an alignment. The degree of likeness is based 
upon the physical-chemical characteristics of the amino acid side chains, i.e. size, shape, 
charge, hydrogen-bonding capacity, and chemical reactivity. Thus similar amino acids possess 
side chains that have similar physical-chemical characteristics. The calculation of identity 
between two aligned amino acid or nucleotide sequences is, in general, an arithmetic 
calculation that counts the number of identical pairs of amino acids or nucleotides in an 
alignment and divides this number by the length of the sequence(s) in the alignment. The 
calculation of similarity between two aligned nucleotide sequences sometimes uses different 
values for transitions and transversions between paired (i.e. matched) nucleotides at various 
positions in the alignment. However, the magnitudes of the similarity and identity scores 
between pairs of nucleotide sequences are usually very close, i.e. within one to two percent. 
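
The identity calculation described above can be sketched as follows. This is a minimal 
illustration rather than the calculation performed by the GAP program: the text leaves open 
whether alignment columns containing a gap count toward the length, so the sketch exposes 
that choice as a parameter. The function name, the gap character and the example sequences 
are assumptions made for the example. 

```python
def percent_identity(aligned_a, aligned_b, gap="-", include_gap_columns=False):
    """Percent of compared alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            # A gapped column is never an identity; it may or may not count toward the length.
            if include_gap_columns:
                compared += 1
            continue
        compared += 1
        if x.upper() == y.upper():
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    a = "ACGT-A"
    b = "ACCTGA"
    print(percent_identity(a, b))                            # gapped column excluded
    print(percent_identity(a, b, include_gap_columns=True))  # gapped column counted
```

The two conventions diverge only where the alignment contains gaps, which is one reason 
identity values for the same pair of sequences can differ slightly between programs. 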

The degree of similarity and identity was determined using the program GAP of the 
Wisconsin Sequence Analysis Package (Version 9). The gap creation and gap extension 
penalties were 50 and 3.0, respectively, for nucleic acid sequence alignments, and 12 and 4, 
respectively, for amino acid sequence comparisons. 

As indicated previously, a partial identity exists between the initial 5'-end ORF 1 clone 
and other isolates of HEV, which supports the proposition that the HEV infection associated 
with patient USP-1 is due to a unique isolate of HEV. In order to more extensively determine 
the degree of relatedness between this isolate and other known isolates of HEV, alignments of 
the extended nucleotide and deduced amino acid sequences were performed. 
Pair-wise nucleotide and amino acid comparisons of HEV US-1, HEV US-2, and 10 
other full length HEV genomes (obtained from a publicly-available database, see Table 14) 
were performed, as described above, to determine the relationship of the US isolates to each 
other and to the known variants of HEV. 

TABLE 14 

Isolate          GenBank Accession Number 
Mexican (M1)     M74506 
Burmese (B1)     M73218 
Burmese (B2)     D10330 
Pakistan (P1)    M80581 
Chinese (C1)     D11092 
Chinese (C2)     L25547 
Chinese (C3)     M94177 
Chinese (C4)     D11093 
Indian (I1)      X98292 
Indian (I2)      X99441 



Nucleotide identity across the entire genomes of US-1, US-2, B1, B2, I2, C1, C2, C3, 
P1, C4 and I1 strains is presented in Table 15. The nucleotide identities of ORF 1, ORF 2, and 
ORF 3 are shown in Tables 16, 17 and 18, respectively. Tables 17 and 18 also contain 
comparisons against a recently isolated swine (S1) sequence, available under GenBank 
accession number AF011921. 

TABLE 15 - Nucleotide Identity Across Genome 

        US-1   US-2   B1     B2     I2     C1     C2     C3     P1     C4     I1 
US-2    92.0 
B1      73.9   74.0 
B2      73.8   74.0   98.5 
I2      73.5   73.8   96.1   95.4 
C1      74.2   74.3   93.9   93.4   92.3 
C2      74.2   74.3   93.5   93.0   92.0   98.7 
C3      74.1   74.3   93.7   93.0   92.0   98.2   98.7 
P1      74.1   74.1   93.6   92.8   92.0   98.2   98.8   98.3 
C4      73.7   73.9   94.5   94.1   92.7   97.1   97.2   96.8   96.7 
I1      74.4   74.4   93.5   93.0   92.2   93.8   94.0   93.8   93.9   93.5 
M1      73.7   74.5   75.9   75.7   75.0   75.9   75.9   75.9   76.1   75.7   75.7 



TABLE 16 - Nucleotide Identity Across ORF 1 

        US-1   US-2   B1     B2     I2     C1     C2     C3     P1     C4     I1 
US-1 
US-2    92.0 
B1      71.7   71.6 
B2      71.7   71.8   98.6 
I2      71.2   71.5   95.7   95.1 
C1      72.1   72.1   93.5   93.1   91.8 
C2      72.2   72.3   93.1   92.7   91.5   98.6 
C3      71.9   72.2   93.3   92.8   91.4   98.1   98.7 
P1      72.2   72.1   93.1   92.6   91.4   98.2   99.0   98.4 
C4      71.5   71.7   94.6   94.4   92.3   96.7   98.8   96.3   96.4 
I1      72.3   72.3   93.2   92.8   91.5   93.6   94.0   93.7   93.9   93.3 
M1      72.0   72.6   73.6   73.5   72.5   73.7   73.8   73.8   73.9   73.4   73.5 

TABLE 17 - Nucleotide Identity Across ORF 2 

        US-1   US-2   B1     B2     I2     C1     C2     C3     P1     C4     I1     M1 
US-1 
US-2    92.2 
B1      79.2   79.6 
B2      86.4   79.4   98.5 
I2      79.0   79.5   99.2   98.4 
C1      79.3   79.5   94.4   98.4   98.4 
C2      79.2   79.4   94.3   97.8   97.8   98.9 
C3      79.3   79.4   94.4   97.8   97.8   98.9   98.4 
P1      79.0   79.3   93.8   98.1   98.7   99.7   99.2   99.2 
C4      78.8   79.3   94.0   97.8   97.8   98.9   98.4   98.4   97.4 
I1      79.4   79.7   94.1   97.6   97.3   97.9   97.0   94.0   93.7   93.9 
M1      78.0   79.3   81.1   90.1   98.5   90.6   90.1   81.0   81.4   90.3   90.3 
S1      92.0   98.9   79.8   84.6   85.4   85.4   85.1   80.2   80.1   84.8   85.1   84.6 



TABLE 18 - Nucleotide Identity Across ORF 3 

        US-1   US-2   B1     B2     I2     C1     C2     C3     P1     C4     I1     M1 
US-1 
US-2    96.2 
B1      87.0   86.6 
B2      86.4   86.3   99.2 
I2      86.4   86.9   97.8   99.2 
C1      87.3   86.3   99.2   98.4   98.4 
C2      86.4   86.1   98.1   97.3   97.8   98.9 
C3      86.7   85.6   98.1   97.3   97.8   98.9   98.4 
P1      87.0   86.6   98.9   98.1   98.7   99.7   99.2   99.2 
C4      86.2   85.8   98.1   97.6   97.8   98.9   98.4   98.4   99.2 
I1      86.4   86.6   97.8   97.6   97.6   97.9   97.0   97.0   97.8   97.8 
M1      84.6   85.2   87.8   90.1   89.5   90.6   90.1   90.1   90.9   90.3   90.3 
S1      94.9   96.7   85.1   84.6   85.4   85.4   85.1   84.8   85.6   84.8   85.1   84.6 

In addition, the ORF 1 nucleotide sequences encoding the methyltransferase proteins 
were compared between each of the US-1, US-2, M1 and P1 isolates. The methyltransferase 
encoding region of the HEV US-1 genome is represented by residues 1-693 of SEQ ID NO:89, 
whereas the methyltransferase encoding region of the HEV US-2 genome is represented by 
residues 36-755 of SEQ ID NO:164. The comparison results are set forth in Table 19. 



TABLE 19 - Methyltransferase Region 

% IDENTITY 
        US-1   US-2   M1     P1 
US-1           93.4   77.0   75.2 
US-2                  78.5   76.0 
M1                           78.8 



The ORF 1 nucleotide sequences encoding the Y domain proteins were compared 
between each of the US-1, US-2, M1 and P1 isolates. The Y domain protein encoding region 
of the HEV US-1 genome is represented by residues 619-1272 of SEQ ID NO:89, whereas the 
Y domain protein encoding region of the HEV US-2 genome is represented by residues 680- 
1334 of SEQ ID NO:164. The comparison results are set forth in Table 20. 



TABLE 20 - Y Domain 

% IDENTITY 
        US-1   US-2   M1     P1 
US-1           94.0   79.0   77.2 
US-2                  79.7   76.8 
M1                           78.3 



The ORF 1 nucleotide sequences encoding the protease proteins were compared 
between each of the US-1, US-2, M1 and P1 isolates. The protease protein encoding region of 
the HEV US-1 genome is represented by residues 1270-2091 of SEQ ID NO:89, whereas the 
protease protein encoding region of the HEV US-2 genome is represented by residues 1332- 
2153 of SEQ ID NO:164. The comparison results are set forth in Table 21. 

TABLE 21 - Protease Region 

% IDENTITY 
        US-1   US-2   M1     P1 
US-1           91.8   65.1   64.0 
US-2                  65.1   63.1 
M1                           68.1 



The ORF 1 nucleotide sequences encoding the hypervariable region were compared 
between each of the US-1, US-2, M1 and P1 isolates. The hypervariable region encoding 
region of the HEV US-1 genome is represented by residues 2092-2364 of SEQ ID NO:89, 
whereas the hypervariable region encoding region of the HEV US-2 genome is represented by 
residues 2194-2429 of SEQ ID NO:164. The comparison results are set forth in Table 22. 



TABLE 22 - Hypervariable Region 

% IDENTITY 
        US-1   US-2   M1     P1 
US-1           83.9   40.3   50.2 
US-2                  45.8   49.8 
M1                           40.4 



The ORF 1 nucleotide sequences encoding the X domain proteins were compared 
between each of the US-1, US-2, M1 and P1 isolates. The X domain protein encoding region 
of the HEV US-1 genome is represented by residues 2365-2841 of SEQ ID NO:89, whereas the 
X domain protein encoding region of the HEV US-2 genome is represented by residues 2430- 
2906 of SEQ ID NO:164. The comparison results are set forth in Table 23. 



TABLE 23 - X Domain 

% IDENTITY 
        US-1   US-2   M1     P1 
US-1           91.6   72.5   71.3 
US-2                  72.7   70.9 
M1                           72.9 

The ORF 1 nucleotide sequences encoding the helicase proteins were compared 
between each of the US-1, US-2, M1 and P1 isolates. The helicase encoding region of the HEV 
US-1 genome is represented by residues 2893-3591 of SEQ ID NO:89, whereas the helicase 
encoding region of the HEV US-2 genome is represented by residues 2958-3656 of SEQ ID 
NO:164. The comparison results are set forth in Table 24. 



TABLE 24 - Helicase Region 

% IDENTITY 
        US-1   US-2   M1     P1 
US-1           92.8   76.5   75.2 
US-2                  75.4   74.1 
M1                           76.2 



The ORF 1 nucleotide sequences encoding the RNA-dependent RNA polymerase 
proteins were compared between each of the US-1, US-2, M1 and P1 isolates. The polymerase 
encoding region of the HEV US-1 genome is represented by residues 3634-5094 of SEQ ID 
NO:89, whereas the polymerase encoding region of the HEV US-2 genome is represented by 
residues 3699-5159 of SEQ ID NO:164. The comparison results are set forth in Table 25. 



TABLE 25 - RNA-dependent RNA Polymerase Region

% IDENTITY

        US-1    US-2    M1      P1
US-1    --      93.1    72.9    75.3
US-2            --      73.6    75.8
M1                      --      77.1
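
The percent identity values in Tables 22 through 25 are pairwise comparisons of aligned sequences. As a minimal sketch of that kind of calculation, assuming pre-aligned sequences of equal length and assuming that alignment columns containing a gap are skipped (the text does not state how gaps were handled), the comparison could be written as follows. The function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_table(aligned):
    """Return {(name_a, name_b): percent identity} for every pair of isolates."""
    return {
        (a, b): round(percent_identity(aligned[a], aligned[b]), 1)
        for a, b in combinations(aligned, 2)
    }


# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "isolate_a": "ATGGCCTTAG",
    "isolate_b": "ATGGCTTTAG",
    "isolate_c": "ATGACT-TAG",
}
toy_table = identity_table(toy_alignment)
```

Each key of `toy_table` is one cell of the upper triangle of a table laid out like Tables 22 through 25.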



In addition, the amino acid identities/similarities of the proteins encoded by the ORF 1,
ORF 2, and ORF 3 sequences of US-1, US-2, B1, B2, I2, C1, C2, C3, P1, C4 and I1 strains are
shown in Tables 26, 27 and 28 respectively. In addition, Tables 27 and 28 also contain
comparisons against the swine sequence (S1). In Tables 26, 27 and 28, the similarities are
presented in the upper right hand halves of the tables and the identities are presented in the
lower left hand halves of the tables.

TABLE 26 - Amino Acid Similarity/Identity Across ORF 1
(upper right: % similarity; lower left: % identity)

       US-1   US-2   B1     B2     I2     C1     C2     C3     P1     C4     I1     M1
US-1   --     97.8   86.0   85.7   84.4   85.9   86.2   84.9   86.4   85.7   86.3   85.4
US-2   97.5   --     86.2   85.8   84.5   85.8   86.0   85.0   86.3   85.7   86.3   85.5
B1     82.4   82.6   --     98.7   96.8   98.4   98.5   97.1   98.5   98.1   98.2   87.0
B2     82.3   82.3   98.6   --     96.2   97.8   97.9   96.3   97.8   97.6   97.6   86.6
I2     80.7   80.7   96.3   95.7   --     96.3   96.4   95.0   96.3   95.9   95.9   85.2
C1     82.5   82.3   98.2   97.5   95.7   --     99.5   97.9   99.4   99.0   98.2   86.9
C2     82.8   82.6   98.4   97.8   95.9   99.4   --     98.2   99.6   99.2   98.4   87.0
C3     81.6   81.6   96.9   96.1   94.4   97.7   98.1   --     98.1   97.6   97.0   85.9
P1     83.0   82.9   98.4   97.7   95.9   99.2   99.6   98.0   --     99.0   98.4   87.1
C4     82.5   82.3   98.0   97.6   95.4   98.8   99.1   97.4   98.9   --     97.8   86.5
I1     82.9   82.9   98.1   97.5   95.5   98.1   98.4   96.9   98.4   97.8   --     87.3
M1     82.0   82.0   83.8   83.4   81.8   83.7   83.9   82.8   84.0   83.4   84.2   --





TABLE 27 - Amino Acid Similarity/Identity Across ORF 2
(upper right: % similarity; lower left: % identity)

       US-1   US-2   B1     B2     I2     C1     C2     C3     P1     C4     I1     M1     S1
US-1   --     98.3   93.3   93.0   93.0   93.5   93.2   92.9   93.2   92.4   92.6   91.5   97.1
US-2   98.0   --     93.3   93.0   93.3   93.3   93.3   93.0   93.3   92.6   92.7   91.7   99.1
B1     91.8   91.8   --     98.9   99.1   99.8   99.2   99.2   99.5   98.8   98.9   94.8   93.0
B2     91.5   91.5   98.9   --     98.3   99.1   98.5   98.5   98.8   98.2   98.2   94.1   92.7
I2     91.5   91.8   99.1   98.3   --     99.2   98.9   98.6   99.2   98.5   98.6   94.5   91.5
C1     92.0   92.0   99.7   98.9   99.1   --     99.4   99.1   99.7   98.9   99.1   95.0   93.2
C2     91.7   92.0   99.1   98.3   98.8   99.4   --     98.8   99.4   98.6   98.8   94.7   93.0
C3     91.4   91.7   99.1   98.3   98.5   99.1   98.8   --     99.1   98.3   98.5   94.4   92.7
P1     91.7   92.0   99.4   98.6   99.1   99.7   99.4   99.1   --     98.9   99.1   95.0   93.0
C4     90.9   91.2   98.6   98.0   98.4   98.9   98.6   98.3   98.9   --     98.3   94.2   92.3
I1     91.1   91.4   98.5   97.7   98.2   98.8   98.5   98.2   98.8   98.0   --     94.7   92.4
M1     90.1   90.6   93.2   92.4   92.9   93.3   93.0   92.9   93.3   92.6   93.0   --     91.2
S1     97.7   98.9   91.7   91.4   91.9   91.8   91.7   91.4   91.7   90.9   91.1   90.2   --

TABLE 28 - Amino Acid Similarity/Identity Across ORF 3
(upper right: % similarity; lower left: % identity)

       US-1   US-2   B1     B2     I2     C1     C2     C3     P1     C4     I1     M1     S1
US-1   --     96.7   85.2   84.4   85.2   85.2   83.6   85.2   85.2   83.6   85.2   79.5   93.5
US-2   96.7   --     85.2   84.4   85.2   85.2   83.6   83.6   85.2   83.6   85.2   81.1   96.7
B1     84.4   84.4   --     98.4   100.0  100.0  98.4   98.4   100.0  98.4   98.4   87.0   83.7
B2     83.6   83.6   98.4   --     98.4   98.4   96.7   96.7   98.4   96.7   96.7   87.0   82.9
I2     84.4   84.4   100.0  98.4   --     100.0  98.4   98.4   100.0  98.4   98.4   87.0   83.7
C1     84.4   84.4   100.0  98.4   100.0  --     98.4   98.4   100.0  98.4   98.4   87.0   83.7
C2     82.8   82.8   98.4   96.7   98.4   98.4   --     96.7   98.4   97.6   96.7   85.4   82.1
C3     84.4   82.8   98.4   96.7   98.4   98.4   96.7   --     98.4   96.7   96.7   85.4   82.1
P1     84.4   84.4   100.0  98.4   100.0  100.0  98.4   98.4   --     98.4   98.4   87.0   83.7
C4     82.8   82.8   98.4   96.7   98.4   98.4   97.6   96.7   98.4   --     96.7   85.4   82.1
I1     84.4   84.4   98.4   96.7   98.4   98.4   96.7   96.7   98.4   96.7   --     88.6   83.7
M1     78.7   80.3   87.0   87.0   87.0   87.0   85.4   85.4   87.0   85.4   88.6   --     79.7
S1     93.5   96.7   82.9   82.1   82.9   82.9   81.3   81.3   82.9   81.3   82.9   78.9   --
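
Tables 26, 27 and 28 each hold two measures in one square matrix, with similarities in the upper right half and identities in the lower left half. The sketch below shows one way such a combined matrix could be separated into two lookups keyed by isolate pair. The three-isolate excerpt uses values from Table 26; the function and variable names are hypothetical.

```python
LABELS = ["US-1", "US-2", "B1"]

# Excerpt of Table 26 (ORF 1): upper right = % similarity, lower left = % identity.
COMBINED = [
    [None, 97.8, 86.0],
    [97.5, None, 86.2],
    [82.4, 82.6, None],
]


def split_triangles(labels, matrix):
    """Split a combined matrix into similarity and identity lookups."""
    similarity = {}
    identity = {}
    for i, row_label in enumerate(labels):
        for j, col_label in enumerate(labels):
            pair = frozenset((row_label, col_label))
            if i < j:
                similarity[pair] = matrix[i][j]  # above the diagonal
            elif i > j:
                identity[pair] = matrix[i][j]  # below the diagonal
    return similarity, identity


similarity, identity = split_triangles(LABELS, COMBINED)
us_pair = frozenset(("US-1", "US-2"))
```

Keying on an unordered pair means the same lookup works whichever isolate is named first.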





In addition, the ORF 1 amino acid sequences defining the methyltransferase proteins
were compared between each of the US-1, US-2, M1 and P1 isolates. The methyltransferase
protein encoded by the HEV US-1 genome is represented by residues 1-231 of SEQ ID NO:91,
whereas the methyltransferase protein encoded by the HEV US-2 genome is represented by
residues 1-240 of SEQ ID NO:166. The comparison results are set forth in Table 29.

TABLE 29 - Methyltransferase Region
(top label: % IDENTITY; side label: % SIMILARITY)

        US-1    US-2    M1      P1
US-1    --      98.7    91.3    88.7
US-2    98.7    --      91.7    89.1
M1      91.8    92.0    --      92.9
P1      90.0    90.4    91.2    --





The ORF 1 amino acid sequences defining the protease proteins were compared
between each of the US-1, US-2, M1 and P1 isolates. The protease protein encoded by the
HEV US-1 genome is represented by residues 424-697 of SEQ ID NO:91, whereas the protease
protein encoded by the HEV US-2 genome is represented by residues 433-706 of SEQ ID
NO:166. The comparison results are set forth in Table 30.



TABLE 30 - Protease Region
(top label: % IDENTITY; side label: % SIMILARITY)

        US-1    US-2    M1      P1
US-1    --      98.5    67.5    69.3
US-2    97.8    --      67.1    68.6
M1      73.3    73.3    --      76.6
P1      74.4    74.0    72.2    --





The ORF 1 amino acid sequences defining Y domain proteins were compared between
each of the US-1, US-2, M1 and P1 isolates. The Y domain protein encoded by the HEV US-1
genome is represented by residues 207-424 of SEQ ID NO:91, whereas the Y domain protein
encoded by the HEV US-2 genome is represented by residues 216-433 of SEQ ID NO:166.
The comparison results are set forth in Table 31.

TABLE 31 - Y Domain
(top label: % IDENTITY; side label: % SIMILARITY)

        US-1    US-2    M1      P1
US-1    --      98.2    92.7    93.6
US-2    98.2    --      92.7    93.6
M1      94.0    94.0    --      93.1
P1      94.5    94.5    91.7    --





The ORF 1 amino acid sequences defining the X domain proteins were compared
between each of the US-1, US-2, M1 and P1 isolates. The X domain encoded by the HEV US-1
genome is represented by residues 789-947 of SEQ ID NO:91, whereas the X domain protein
encoded by the HEV US-2 genome is represented by residues 799-957 of SEQ ID NO:166.
The comparison results are set forth in Table 32.



TABLE 32 - X Domain
(top label: % IDENTITY; side label: % SIMILARITY)

        US-1    US-2    M1      P1
US-1    --      97.5    82.4    80.5
US-2    97.5    --      81.8    79.9
M1      88.0    87.4    --      86.1
P1      84.3    83.6    83.0    --





The ORF 1 amino acid sequences defining helicase proteins were compared between
each of the US-1, US-2, M1 and P1 isolates. The helicase encoded by the HEV US-1 genome is
represented by residues 965-1197 of SEQ ID NO:91, whereas the helicase encoded by the HEV
US-2 genome is represented by residues 975-1207 of SEQ ID NO:166. The comparison results
are set forth in Table 33.



TABLE 33 - Helicase Region
(top label: % IDENTITY; side label: % SIMILARITY)

        US-1    US-2    M1      P1
US-1    --      99.1    89.7    91.0
US-2    99.1    --      90.6    91.8
M1      93.1    94.0    --      95.2
P1      94.0    94.8    91.0    --





The ORF 1 amino acid sequences defining the hypervariable regions were compared
between each of the US-1, US-2, M1 and P1 isolates. The hypervariable region encoded by
the HEV US-1 genome is represented by residues 698-788 of SEQ ID NO:91, whereas the
hypervariable region encoded by the HEV US-2 genome is represented by residues 707-798 of
SEQ ID NO:166. The comparison results are set forth in Table 34.



TABLE 34 - Hypervariable Region
(top label: % IDENTITY; side label: % SIMILARITY)

        US-1    US-2    M1      P1
US-1    --      82.4    25.0    27.7
US-2    79.1    --      25.0    21.0
M1      25.0    25.0    --      20.8
P1      31.9    21.0    18.0    --

The ORF 1 amino acid sequences defining the RNA-dependent RNA polymerase
proteins were compared between each of the US-1, US-2, M1 and P1 isolates. The polymerase
encoded by the HEV US-1 genome is represented by residues 1212-1698 of SEQ ID NO:91,
whereas the polymerase encoded by the HEV US-2 genome is represented by residues
1222-1708 of SEQ ID NO:166. The comparison results are set forth in Table 35.



TABLE 35 - RNA-dependent RNA Polymerase Domain
(top label: % IDENTITY; side label: % SIMILARITY)

        US-1    US-2    M1      P1
US-1    --      99.0    86.0    87.8
US-2    99.0    --      86.2    87.7
M1      89.7    89.9    --      92.6
P1      91.6    91.6    89.5    --





In addition to the foregoing, several additional HEV isolates belonging to the HEV US-type
family were identified during the course of this work (see, Example 13 below). The
additional isolates are denoted as It1 (Italian strain), G1 (first Greek strain) and G2 (second
Greek strain). Additional sequence comparisons were performed and include the It1, G1 and
G2 sequences, the results of which are presented below in Tables 36 and 37. Table 36 shows
the nucleotide and deduced amino acid identities between isolates of HEV over a 371 base (123
amino acids) ORF 1 fragment. The ORF 1 fragment corresponds to residues 26-396 of SEQ ID
NO:89. Table 37 shows the nucleotide and deduced amino acid identities between isolates of
HEV over a 148 base (49 amino acid) ORF 2 fragment. The ORF 2 fragment corresponds to
residues 6307-6454 of SEQ ID NO:89. In both Tables 36 and 37, the isolates represented are
Burmese (B1, B2), Chinese (C1, C2, C3, C4), Indian (I1, I2), Pakistan (P1), Mexican (M1),
Swine (S1), United States (US-1, US-2), Greek (G1, G2) and Italian (It1).



Pairwise comparisons of the full length nucleotide sequences were performed using the
nucleotide sequences of the respective genomes of HEV US-1 and HEV US-2 together with the
genomes of the other HEV isolates identified in Table 14. The results of the comparison
are shown in Table 15. At the nucleotide level, HEV US-1 and HEV US-2 were most closely
related to each other, with 92.0% identity across the entire genome. The full length
Burmese-like isolates demonstrated similar identities ranging from 92.0 to 98.8%. The US
isolates were 73.5 to 74.5% identical to the Burmese-like and Mexican isolates. This is
similar to the identity seen between any one Burmese-like isolate and the Mexican isolate,
75.0 to 76.1% nucleotide identity. These data indicate that the US isolates are members of a
new strain variant of HEV, distinct from the Burmese and Mexican strains.

Similar degrees of identity are found when smaller portions of each genome are
analyzed, such as the individual ORFs. These values are presented in Tables 16, 17 and 18 for
ORF 1, ORF 2, and ORF 3, respectively. Across each region, the Burmese and Pakistani
isolates demonstrate the highest degree of identity, ranging from 93.1 to 98.9% identity. The
Mexican isolate is distinct, with identities of 73.6 to 90.1% to the Burmese-like isolates. HEV
US-1 nucleotide sequence analysis reveals a significant degree of divergence, with ORF 1
sequences being less than 72% identical to the Burmese-like and Mexican isolates. Similarly,
ORF 2 and ORF 3 sequences were less than 79.1% and 86.9% identical to the Burmese-like and
Mexican isolates, respectively.

The variability seen at the nucleotide level is reflected in the amino acid similarity and
identity of the translated open reading frames. ORF 1 is the most divergent product, potentially
due to the presence of a hypervariable region. The US isolates possess 97.5% amino acid
identity across this region (Table 26). This is similar to the 94.4 to 99.6% identity seen
between Burmese-like ORF 1 proteins. The US ORF 1 products are 80.7 to 83.0% identical to
Burmese-like and Mexican proteins (Table 26). These values are similar to those observed
between any one Burmese-like isolate and the Mexican isolate, ranging from 81.8 to 84.2%
identity. Amino acid similarity values are generally up to 3.5% higher than the identity value,
reflecting a large number of conservative amino acid substitutions. The ORF 2 product is the
most conserved, potentially due to its role as the viral capsid protein. The US ORF 2 products
are 98.0% identical to each other, while being 90.1 to 92% identical to Burmese and Mexican
ORF 2 proteins (Table 27). Again, these ranges mirror those observed between Burmese
isolates (97.7 to 99.7% identity). Identity between Burmese and Mexican isolates is slightly
greater than that between the US variant and other variants, being 92.4 to 93.3%. Amino acid
similarity across ORF 2 adds approximately 1.5% to the identity value. The ORF 3 product of
HEV US-1 and HEV US-2 shared 96.7% amino acid identity. The Burmese isolates showed
96.7 to 100% amino acid identity. ORF 3 amino acid identities of the US isolates to the
Burmese and Mexican isolates were 78.7 to 84.4%, slightly less than that observed between
Burmese and Mexican isolates, 85.4 to 88.6% identity (Table 28). Amino acid similarity across
ORF 3 was generally the same as the identity values; however, some comparisons demonstrated
similarity values less than 1.0% greater than the identity value. These amino acid similarity
and identity values indicate that the analysis of short amino acid sequences produces similar
results to full length and partial nucleotide analyses, indicating that the US isolates are closely
related to each other and genetically distinct from previously characterized isolates of HEV.

Tables 27 and 28 also include pairwise amino acid sequence comparisons with a HEV-like
isolate recently identified in swine (Meng et al. (1997) Proc. Natl. Acad. Sci. USA 94:
9860-9865). Only 2021 bp across the ORF 2/3 region have been characterized (GenBank
Accession Number: AF011921). The US swine sequence is 92% identical to the corresponding
region of HEV US-1 at the nucleotide level. It is noted that HEV US-1 is very similar at the
amino acid level to the recently identified swine virus. For example, the HEV US-1 and swine
strains exhibit 97.1% and 93.5% identity over the respective ORF 2 and ORF 3 sequences
(Tables 27 and 28, respectively).

Partial sequences of 210 nucleotides from two HEV isolates from China referred to as
G9 and G20 (GenBank Accession numbers X87306 and X87307, respectively) recently have
been described in the literature (Huang et al. (1995) J. Med. Virology 47: 303-308). These
fragments represent nucleotide sequences homologous to residue numbers 4533 to 4742 of SEQ
ID NO:89. Their encoded amino acid sequences (69 amino acid residues in length) are
homologous to residue numbers 1512-1580 of SEQ ID NO:91. The results from the pairwise
comparisons of the nucleotide sequences and the predicted amino acid sequences of these
sequences are shown in Tables 38 and 39. Results indicate that the G9 and G20 isolates are
89% identical to one another at the nucleotide level across this region. The closely related
Burmese and Pakistan isolates are 92.9% identical over this range. The US-1 isolate exhibits
77.1% and 81.0% identity to the G9 and G20 isolates, respectively, across this region,
suggesting that the US-1 isolate also is distinct from these isolates. Although the G9 and G20
sequences are most closely related at the nucleotide level, the deduced amino acid translation
of G20 is most similar/identical to the sequence from the US-1 isolate (Table 39). This is most
likely due to the short length of amino acids utilized in the analysis.
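
The effect of fragment length noted above can be made concrete with a little arithmetic. Over a 69 residue fragment, a single residue difference moves percent identity by roughly 1.45 points, so each reported percentage corresponds to a small whole number of matching residues. The sketch below converts a few identity values from Table 39 back into residue counts; the helper names are hypothetical.

```python
FRAGMENT_LENGTH = 69  # amino acids compared in Table 39


def identical_residues(percent_identity, length=FRAGMENT_LENGTH):
    """Nearest whole number of identical residues implied by a percent identity."""
    return round(percent_identity * length / 100)


def percent_from_count(matches, length=FRAGMENT_LENGTH):
    """Percent identity, to one decimal place, for a given number of matches."""
    return round(100 * matches / length, 1)


# Change in percent identity caused by one residue over this fragment.
step = round(100 / FRAGMENT_LENGTH, 2)

# Identity values taken from Table 39.
for pair, pct in [("Bur/Pak", 98.6), ("US-1/G20", 95.7), ("G20/G9", 87.0)]:
    count = identical_residues(pct)
    print(pair, pct, "->", count, "of", FRAGMENT_LENGTH, "residues identical")
```

On this reading, the 95.7% identity between US-1 and G20 corresponds to three differing residues out of 69.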



TABLE 38. Identity across 210 nucleotides of ORF 1

        Pak     Mex     US-1    G20     G9
Bur     92.9    74.8    75.7    78.1    76.7
Pak     --      75.2    76.7    78.1    76.7
Mex             --      77.1    75.2    71.9
US-1                    --      81.0    77.1
G20                             --      89.0



TABLE 39. Similarity/identity across 69 amino acids of ORF 1

        Pak          Mex          US-1         G20          G9
Bur     98.6/98.6    92.8/88.4    92.8/85.5    92.8/88.4    82.6/79.7
Pak     --           94.2/89.9    91.3/84.1    91.3/87.0    84.1/81.2
Mex                  --           89.9/87.0    89.9/87.0    81.2/78.3
US-1                              --           100/95.7     88.4/88.1
G20                                            --           88.4/87.0

Example 7 - Phylogenetic Analyses.

Alignments of nucleotide and amino acid sequences were performed in order to
determine the phylogenetic relationships between the novel US-type isolates and other isolates
of HEV. The alignments were made using the program PILEUP of the Wisconsin Sequence
Analysis Package, version 9 (Genetics Computer Group, Madison, WI). Evolutionary distances
between sequences were determined using the DNADIST program (Kimura 2-parameter
method) with a transition-transversion ratio of 2.0 and PROTDIST (Dayhoff PAM matrix)
program of the PHYLIP package, version 3.5c (Felsenstein 1993, Department of Genetics,
University of Washington, Seattle). The computed distances were used for the construction of
phylogenetic trees using the program FITCH (Fitch-Margoliash method). The robustness of the
trees was determined by bootstrap resampling of the multiple-sequence alignments (100 sets or
1,000 sets) with the programs SEQBOOT, DNADIST, the neighbor-joining method of the
program NEIGHBOR, and CONSENSE (PHYLIP package). Bootstrap values of less than 70%
are regarded as not providing evidence for a phylogenetic grouping (Muerhoff et al. (1997)
Journal of Virology, 71: 6501-6508). The final trees were produced using RETREE (PHYLIP)
with the midpoint rooting option and the graphical output was created with TREEVIEW (Page,
(1996) Computer Applied Biosciences 12: 357-358), the results of which are presented in
Figures 5, 6, 10, and 11.
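
For orientation, the Kimura 2-parameter distance named above has a well known closed form, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the observed proportions of transitions and transversions between two aligned sequences. The sketch below implements that textbook formula only. It is not the DNADIST program, whose calculation takes the transition-transversion ratio as an input and may differ in detail, and the function name and test sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Textbook Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: gaps and ambiguity codes are skipped
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    inner = (1 - 2 * p - q) * math.sqrt(1 - 2 * q) if 1 - 2 * q > 0 else 0.0
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P formula")
    return -0.5 * math.log(inner)


# Hypothetical 20-base example: two transitions and one transversion.
reference = "ACGTACGTACGTACGTACGT"
variant = "GTCTACGTACGTACGTACGT"
distance = k2p_distance(reference, variant)
```

With P = 0.10 and Q = 0.05, the distance is about 0.170 substitutions per site, slightly more than the raw 0.15 proportion of differing sites because the formula corrects for multiple substitutions at the same position.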

Phylogenetic analysis with complete genomes. To more extensively determine the degree
of relatedness between HEV US-1, HEV US-2, and other known isolates of HEV, nucleotide
alignments were performed. The full length HEV US-1 and HEV US-2 genomes were aligned
with 10 other isolates of HEV from which complete genomes are available (Table 14).

Examination of the phylogenetic distances based upon alignments of the HEV US isolates
and other isolates of HEV demonstrates that there is considerable evolutionary distance between
those from the US and those from other geographical areas, as determined using the DNADIST
program (Kimura 2-parameter method) with a transition-transversion ratio of 2.0 (Table 40).
The distances calculated also show the close relationship between the isolates originating from
Asia. Within this Burmese-like group the maximum distance calculated from the full length
alignment is 0.0850 nucleotide substitutions per base. The minimum distance between a
member of this group and a US isolate is 0.3322 substitutions. The Mexican strain shows
similar distances to the Burmese-like group of 0.3055 to 0.3132 substitutions and 0.3322 to
0.3462 substitutions to the US isolate. The genetic distance between HEV US-1 and HEV US-2
of 0.0812 substitutions is similar to that seen between Burmese-like isolates. The relative
evolutionary distances between the viral sequences analyzed are readily apparent upon
inspection of the unrooted phylogenetic tree presented in Figure 5, where the branch lengths are
proportional to the evolutionary distances. In the phylogenetic tree, the Burmese-like isolates,
the Mexican isolate and the US isolates each represent a major branch. In addition, the
branching of the prototype viruses is supported with bootstrap values of 100%. Smaller
segments of the genome (e.g., ORF 1, ORF 2, or ORF 3) were individually analyzed,
resulting in trees analogous to those obtained with the full length sequence and shown in Figure
5. These analyses demonstrate that the HEV US isolates represent a distinct strain or variant of
HEV and that HEV US-1 and HEV US-2 are as similar to each other as are the most divergent
Burmese-like isolates.

TABLE 40 - Phylogenetic distances over the full length sequence

        B1      B2      C1      C2      C3      C4      I1      I2      P1      M1      US-1
B2      [illegible in source]
C1      [illegible in source]
C2      [illegible in source]
C3      [illegible in source; the only legible fragments in rows B2 through C3 are 0.0178 and 0.0132]
C4      0.0574  0.0611  0.0304  0.0290  0.0329
I1      0.0677  0.0728  0.0645  0.0625  0.0647  0.0681
I2      0.0403  0.0477  0.0820  0.0849  0.0846  0.0776  0.0832
P1      0.0693  0.0751  0.0178  0.0120  0.0172  0.0335  0.0633  0.0850
M1      0.3096  0.3120  0.3086  0.3089  0.3091  0.3132  0.3120  0.3259  0.3055
US-1    0.3406  0.3418  0.3360  0.3345  0.3367  0.3445  0.3322  0.3464  0.3363  0.3462
US-2    0.3413  0.3408  0.3370  0.3361  0.3374  0.3445  0.3333  0.3461  0.3377  0.3367  0.0812
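
The group-level statements drawn from Table 40 amount to taking minima over rows of the distance matrix. As a hedged illustration, the sketch below uses the legible US-1 and US-2 rows of Table 40, assuming that the column order follows the row order (B1, B2, C1, C2, C3, C4, I1, I2, P1), and finds the Burmese-like isolate closest to each US isolate. The variable and function names are hypothetical.

```python
BURMESE_LIKE = ["B1", "B2", "C1", "C2", "C3", "C4", "I1", "I2", "P1"]

# Legible rows of Table 40; column order is assumed to match BURMESE_LIKE.
US1_TO_BURMESE_LIKE = [0.3406, 0.3418, 0.3360, 0.3345, 0.3367,
                       0.3445, 0.3322, 0.3464, 0.3363]
US2_TO_BURMESE_LIKE = [0.3413, 0.3408, 0.3370, 0.3361, 0.3374,
                       0.3445, 0.3333, 0.3461, 0.3377]
US1_TO_US2 = 0.0812


def closest(labels, distances):
    """Return (label, distance) for the smallest distance in a row."""
    distance, label = min(zip(distances, labels))
    return label, distance


closest_to_us1 = closest(BURMESE_LIKE, US1_TO_BURMESE_LIKE)
closest_to_us2 = closest(BURMESE_LIKE, US2_TO_BURMESE_LIKE)

# Ratio of the smallest between-group distance to the US-1/US-2 distance.
separation = min(closest_to_us1[1], closest_to_us2[1]) / US1_TO_US2
```

Under that column assumption, the smallest distance from a US isolate to the Burmese-like group is 0.3322, a little over four times the 0.0812 distance between the two US isolates.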



Comparison to ORF 2/ORF 3 from Swine HEV. In order to determine the relationship
between a recently described swine-HEV and the human HEV US-1 and HEV US-2 isolates,
comparisons of the nucleotide sequences across the complete ORF 2 and ORF 3 were
performed using analogous regions from the 10 full length sequences utilized above (Table 14).
Phylogenetic analysis produces genetic distances of 0.0799 to 0.0810 nucleotide substitutions
per position between the US and swine HEV isolates (Table 41). These values are similar to
those observed between the most distant Burmese-like isolates. The US and swine isolates
group closely on an unrooted phylogenetic tree when the ORF 2/3 nucleotide sequences are
analyzed (See, Figure 6). These isolates form a phylogenetic group distinct from the Mexican
isolate and the Burmese-like isolates. These groupings are supported by bootstrap values of
100%.

TABLE 41 - Phylogenetic distances between USswine and human HEV isolates

           US-2      USswine   Burmese          Mexican
US-1       0.0799    0.0810    0.2441-0.2495    0.2671
US-2       --        0.0795    0.2409-0.2479    0.2486
USswine              --        0.2348-0.2485    0.2615
Burmese                        0.0119-0.0716    0.2183-0.2248
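
The bootstrap values cited for these groupings come from resampling the columns of the multiple-sequence alignment with replacement and rebuilding a tree from each resampled set. The sketch below illustrates only the resampling step, in the spirit of what a program such as SEQBOOT does, together with the 70% rule of thumb stated in Example 7. It does not reproduce any PHYLIP code, and the function names and the toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement; the same columns are used for every row."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    (length,) = lengths
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Generate n_replicates resampled alignments from a seeded generator."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


def supported(bootstrap_percent, threshold=70.0):
    """Apply the rule that values below 70% do not support a grouping."""
    return bootstrap_percent >= threshold


# Hypothetical toy alignment, for illustration only.
toy = {"x": "ACGTAC", "y": "ACGTTC", "z": "ATGTTC"}
replicates = bootstrap_replicates(toy, n_replicates=100, seed=1)
```

Each replicate would then be passed through the distance and tree-building steps, and the percentage of replicate trees containing a given grouping is its bootstrap value.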



Example 8 - HEV Serologic Studies 
A. Background 

Early studies indicate that epitopes useful for diagnosis of HEV infections are located
near the carboxyl terminus of ORF 2 and ORF 3 of both the Burmese and Mexican strains of
HEV. The two antigens from the Mexican strain, referred to hereinafter as M 3-2 and M 4-2,
comprise 42 and 32 amino acids near the carboxyl terminus of ORF 2 and ORF 3, respectively
(Yarbough et al. (1991) Journal of Virology, 65: 5790-5797). The two antigens from the
Burmese strain of HEV, referred to hereinafter as B 3-2 and B 4-2 proteins, comprise 42 and 33
amino acids near the carboxyl terminus of ORF 2 and ORF 3, respectively (Yarbough et al.
(1991) supra). Diagnostic tests designed to detect IgG, IgA and IgM class antibodies to HEV
have been developed based on these antigenic regions. Additional HEV recombinant proteins
have been generated that encompass full-length ORF 3 (Dawson et al. (1992) Journal of
Virology Methods, 38: 175-186) or additional amino acid sequences from the ORF 2 protein
(Dawson et al. (1993) supra), to potentially enhance the detection of antibodies to HEV.
Comparative studies indicate that the original recombinant proteins and synthetic peptides
(B4-2, B3-2, M3-2, M4-2) were as effective as the larger recombinant proteins in detecting
antibodies to HEV in known cases of acute HEV infection. A licensed test to detect antibodies
to HEV is manufactured by Abbott Laboratories and consists of the full length Burmese strain
ORF 3 protein and the carboxyl 327 amino acids of the Burmese strain ORF 2 protein.






After initial serological studies demonstrating the utility of B 3-2, B 4-2, M 3-2 and M
4-2, it was established that six additional amino acids reside at the carboxyl terminus of ORF 2
of both the Burmese and Mexican strains of HEV which do not form part of the M 3-2 and B
3-2 antigenic peptides. Since the carboxyl ends of ORF 2 and ORF 3 have been shown to be of
value for the Burmese and Mexican strains of HEV, synthetic peptides corresponding to
these regions of the genome were generated for the US-1 strain of HEV. The synthetic peptides
corresponding to the 48 amino acids at the carboxyl end of the ORF 2 were generated for the
Burmese and Mexican strains of HEV (SEQ ID NOS:172 and 170, respectively), and are
referred to as B 3-2e and M 3-2e (where "e" designates extended amino acid sequence). In
addition, synthetic peptides representing the 33 amino acids at the carboxyl end of the HEV
US-1 ORF 3 were generated for the Burmese and Mexican strains of HEV (SEQ ID NOS:171
and 169, respectively), and are referred to as B4-2 and M4-2. The synthetic peptide based on
the epitope from within ORF 2 for the HEV US-1 strain (SEQ ID NO:174) is referred to as
US 3-2e. The synthetic peptide based on the epitope at the carboxyl end of the HEV US-1 ORF
3 (SEQ ID NO:173) is referred to as US 4-2. Each of these peptides derived from the Mexican,
Burmese and US strains of HEV was synthesized, coated on a solid phase and utilized in
ELISA tests to determine the relative usefulness of these synthetic peptides.

As noted in Table 42, the amino acid identity between HEV US-1 and the Burmese,
Mexican, and Pakistani strains of HEV ranges from about 87.5% to about 91.7% for the amino
acids comprising the 3-2e epitopes within ORF 2, and from about 63.6% to about 72.7% for the
amino acids comprising the 4-2 epitopes within ORF 3. Without wishing to be bound by
theory, given the degree of variability in the regions encoding these epitopes, it is likely that
there may be strain-specific antibody responses to these viruses.

TABLE 42 - (Similarity/Identity)

        3-2e Peptide                           4-2 Peptide
        Pak          Mex          US-1         Pak          Mex          US-1
Bur     100/97.9     91.7/91.7    93.7/91.7    100/100      72.7/72.7    72.7/72.7
Pak     --           91.7/91.7    93.7/91.7    --           72.7/72.7    72.7/72.7
Mex                  --           89.6/87.5                 --           63.6/63.6



B. Use of ELISAs in diagnosing acute HEV infection

It has been reported that most cases of acute HEV infection in man are accompanied by
IgM class antibodies which bind to one or more HEV recombinant proteins or synthetic
peptides. If a person does not have IgM class antibodies to HEV, a diagnosis of acute HEV
infection cannot be made on serology alone but may require RT-PCR and/or other
tests to verify HEV as the etiologic agent.

C. Generation of Synthetic Peptides 

Peptides were prepared on a Rainin Symphony Multiple Peptide Synthesizer using
standard FMOC solid phase peptide synthesis on a 0.025 µmole scale with HBTU coupling
chemistry by in situ activation provided by N-methyl-morpholine, with 45 minute coupling
times at each residue, and double coupling at predetermined residues. Standard cleavage of the
resin provided the unprotected peptide, followed by ether precipitation and washing. The
peptides synthesized are shown in Table 43.

TABLE 43

Peptide    Sequence                                            SEQ ID NO:
B3-2e      TLDYPARAHTFDDFCPECRPLGLQGCAFQSTVAELQRLKMKVGKTREL    SEQ ID NO:172
B4-2       ANPPDHSAPLGVTRPSAPPLPHVVDLPQLGPRR                   SEQ ID NO:171
M3-2e      TFDYPGRAHTFDDFCPECRALGLQGCAFQSTVAELQRLKVKVGKTREL    SEQ ID NO:170
M4-2       ANQPGHLAPLGEIRPSAPPLPPVADLPQPGLRR                   SEQ ID NO:169
US 3-2e    TVDYPARAHTFDDFCPECRTLGVQGCAFQSTIAEVQRLKMKVGKTREV    SEQ ID NO:174
US 4-2     DSRPAPSVPLGVTSPSAPPLPPVVDLPQLGLRC                   SEQ ID NO:173
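
Because the peptides in Table 43 are short and of equal length within each family, their identity can be checked by direct position-by-position counting. The sketch below does this for the B3-2e and M3-2e sequences as listed in Table 43, assuming an ungapped comparison. The values in Table 42 derive from the alignment procedure used in this work, so ungapped counting need not reproduce every entry; the function name is hypothetical.

```python
# Sequences as listed in Table 43.
PEPTIDES = {
    "B3-2e": "TLDYPARAHTFDDFCPECRPLGLQGCAFQSTVAELQRLKMKVGKTREL",
    "M3-2e": "TFDYPGRAHTFDDFCPECRALGLQGCAFQSTVAELQRLKVKVGKTREL",
}


def ungapped_identity(pep_a, pep_b):
    """Return (matching positions, percent identity) for equal-length peptides."""
    if len(pep_a) != len(pep_b):
        raise ValueError("ungapped comparison needs peptides of equal length")
    matches = sum(1 for a, b in zip(pep_a, pep_b) if a == b)
    return matches, 100.0 * matches / len(pep_a)


matches, percent = ungapped_identity(PEPTIDES["B3-2e"], PEPTIDES["M3-2e"])
differing_positions = [
    i + 1
    for i, (a, b) in enumerate(zip(PEPTIDES["B3-2e"], PEPTIDES["M3-2e"]))
    if a != b
]
```

For this pair the count is 44 identical residues out of 48, or about 91.7%, in line with the Burmese/Mexican 3-2e identity listed in Table 42.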



D. Analysis of Synthesized Peptides 

The synthesized peptides were analyzed for their amino acid composition as follows.
The crude peptides from the small scale syntheses (0.025 µmole) were analyzed for their
quality by C18 reverse phase high pressure liquid chromatography using an acetonitrile/water
gradient with 0.1% (v/v) trifluoroacetic acid (TFA) in each solvent. From the analytical
chromatogram, the major peak from each synthesis was collected and the effluent analyzed by
mass spectrometry (electrospray and/or laser desorption mass spectrometry). Purification of the
peptides (small and/or large scale) was achieved using C18 reverse phase HPLC with an
acetonitrile/water gradient with 0.1% TFA in each solvent. The major peak was collected, and
lyophilized until use.

E. ELISA Test 

The utility of the HEV US-1 epitopes was determined by coating 1/4 inch polystyrene
beads with each peptide. Specifically, the peptides were solubilized in water or water plus
glacial acetic acid and diluted to contain 10 µg/mL in phosphate buffer (pH 7.4). A total of 60
polystyrene beads were added to a scintillation vial along with 14 mL of peptide solution (10
µg/mL) and incubated at 56°C for two hours in phosphate buffered saline (PBS). After
incubation, the liquid was aspirated and replaced with a buffer containing 0.1% Triton X-100®.
The beads were exposed to this solution for 60 minutes, the fluid aspirated and the beads
washed twice with PBS buffer. The beads then were incubated with 5% bovine serum albumin
solution for 60 minutes at 40°C. After incubation, the fluid was aspirated and the beads rinsed
with PBS. The resulting beads were soaked in PBS containing 5% sucrose for 30 minutes. The
fluids then were aspirated and the beads air-dried.

In one study, one-quarter inch polystyrene beads were coated with various 
concentrations of the synthetic peptide (approximately 50 beads per lot) and evaluated in an 
5 ELISA test (described below) using serum from an anti-HEV seronegative human as a negative 
control and convalescent sera from an HEV-infected person as a positive control. The bead 
coating conditions providing the highest ratio of positive control signal to negative control 
signal were selected for scaling up the bead coating process. Two 1,000 bead lots were 
produced for both HEV US-1 ORF 2 and ORF 3 epitopes and then used as follows. 
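
The selection rule described above (keep the coating condition that gives the highest ratio of positive control signal to negative control signal) can be sketched as follows. This is a minimal illustration only: the concentrations and absorbance values are hypothetical placeholders, not data from this study, and the function name is an assumption introduced for the example.

```python
# Hypothetical A492 readings per coating concentration (ug/mL); illustrative only,
# not measurements from the study.
readings = {
    1.0: {"positive": 0.60, "negative": 0.05},
    5.0: {"positive": 1.20, "negative": 0.06},
    10.0: {"positive": 1.50, "negative": 0.10},
}


def best_coating_condition(readings):
    """Return (concentration, P/N ratio) for the highest positive/negative ratio."""
    ratios = {conc: r["positive"] / r["negative"] for conc, r in readings.items()}
    best = max(ratios, key=ratios.get)
    return best, ratios[best]


best_conc, best_ratio = best_coating_condition(readings)
print(f"selected {best_conc} ug/mL (P/N = {best_ratio:.1f})")
```

Ranking by the ratio rather than by the raw positive signal penalizes conditions that raise background along with signal, which is why the highest concentration is not automatically chosen in this made-up data set.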

A sample of sera or plasma was diluted in specimen diluent and mixed with antigen-coated
solid phase under conditions that permit an antibody in the sample to bind to the
immobilized antigen. After washing, the resulting beads were mixed with horseradish
peroxidase (HRPO)-labeled anti-human antibodies that bind to either tamarin or human
antibodies bound to the solid phase. Specimens which produced signals above a cutoff value
were considered reactive.

More specifically, the preferred ELISA format requires contacting the antigen-coated
solid phase with serum pre-diluted with specimen diluent (buffered solution containing animal
sera and non-ionic detergents). Specifically, 10 µL of serum was diluted in 150 µL of
specimen diluent and vortexed. Then 10 µL of this pre-diluted specimen was added to each well
of an ELISA plate, followed by the addition of 200 µL of specimen diluent and an
antigen-coated polystyrene bead. The ELISA plate then was incubated in a Dynamic Incubator
(Abbott Laboratories) with constant agitation at room temperature for 1 hour. After the
incubation, the fluids were aspirated, and the wells washed three times in distilled water (5 mL
per wash). Next, 200 µL of HRPO-labeled goat anti-human immunoglobulin diluted in a
conjugate diluent (buffered solution containing animal sera and non-ionic detergents) was
added to each well and the ELISA plate incubated for 1 hour, as indicated above. The wells
then were washed three times in distilled water, the beads containing antigen and bound
immunoglobulins removed from each well, and then placed in a test tube with 300 µL of a
solution of 0.1 M citrate buffer (pH 5.5), 0.3% o-phenylenediamine-2HCl and 0.02% hydrogen
peroxide. After 30 minutes at room temperature, the reaction was terminated by the addition of
1 N sulphuric acid. The resulting absorbance at 492 nm then was recorded. The intensity of the
color produced was directly proportional to the amount of antibody present in the test sample.
For each group of specimens, a preliminary cutoff value was set to separate specimens which
presumably contained antibodies to the HEV epitope from those specimens which did not.
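
The two-step specimen dilution in the format above can be checked with a short calculation. The sketch below assumes simple volumetric mixing and ignores any volume displaced by the bead; the helper name is illustrative and not part of the described procedure.

```python
from fractions import Fraction


def dilution(sample_ul, diluent_ul):
    """Fraction of the original sample remaining after one dilution step."""
    return Fraction(sample_ul, sample_ul + diluent_ul)


pre_dilution = dilution(10, 150)   # 10 uL serum into 150 uL specimen diluent
in_well = dilution(10, 200)        # 10 uL pre-dilution into 200 uL specimen diluent
overall = pre_dilution * in_well

print(f"pre-dilution 1:{1 / pre_dilution}, overall 1:{1 / overall}")
```

Under these assumptions the first step is a 1:16 pre-dilution, which agrees with the 1:16 specimen pre-dilution stated for the recombinant protein ELISA in Example 10.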

Panel 1: Testing of pre-screened panels

In order to demonstrate the utility of epitopes derived from the HEV US-1 strain, a
panel of specimens was tested by an ELISA based on the HEV US-1 amino acid sequences
(Table 44). These samples had been pre-screened for antibodies to HEV, using a combination of
existing peptides and a licensed anti-HEV test (Abbott Laboratories) as described above and in
published reports (Dawson et al. (1993) supra; Paul et al. (1993) supra).

The first 10 members of the panel consisted of specimens obtained from US volunteer
blood donors whose sera were negative for antibodies to HEV following analysis using a
combination of peptides and recombinant proteins derived from Burmese and Mexican strains
of HEV. All the specimens were non-reactive with ELISAs derived from HEV US-1. Five
additional specimens were obtained from individuals suffering from acute hepatitis, and who
were diagnosed with acute HEV infection because their sera were reactive for both IgG and IgM
class antibodies to HEV recombinant antigens and synthetic peptides based on the Burmese and
Mexican strains of HEV. Three of the five samples were from Egypt, one from India and one
from Norway (a traveler). HEV RNA was detected by RT-PCR in all five of these individuals.
These five members were tested for antibodies to the HEV US-1 isolate and both IgG and IgM
class antibodies were detected in each of the cases (Table 44). Thus, these data support the use
of synthetic peptides from the US-1 strain of HEV as having utility in diagnosing exposure to
HEV and for diagnosing acute HEV infections.

TABLE 44



Specimens             Licensed anti-HEV test  US isolate, IgG         US isolate, IgM
Tested                IgG         IgM         4-2         3-2e        4-2         3-2e

Neg. Control          0.061       0.084       0.031       0.041       0.071       0.109
Pos. Control          0.567       1.051       1.606       1.619       1.376       1.798

US Volunteer Donors
TG 827                -           -           -           -           -           -
EG 549                -                       -           -           -           -
EC 760                -           -           -           -           -
RF762                 -                       -           -                       -
RF762                 -           -           -           -           -           -
RG 730                -           -           -           -           -           -
NH770                 -           -           -           -           -           -
AS 705
BW 494
CD 648

Egypt
7                                 +           +           +           +           +
9                     +           +           +           +           +           +
12                    +           +           +                       +           +

India
543                   +           +           +           +           +           +

Norway
Ml                    +           +           +           +           +           +



Panel 2: Detection of antibodies to HEV in biological source of HEV US-1 isolate 



Serial bleeds were obtained from the patient described in Example 1, whose serum
served as the biological source for the HEV US-1 strain. Based on serological data obtained for
the Burmese and Mexican strains of HEV, this patient would have been misdiagnosed as HEV
negative because of the lack of detectable IgM class antibodies to HEV. However, both IgM
class (Table 45) and IgG class (Table 46) antibodies to the HEV US-1 strain were detected on
all four bleed dates. Had this patient's sera been analyzed for the presence of
IgG and IgM class antibodies to the HEV US 3-2e and US 4-2 peptides, a positive diagnosis of
acute HEV infection would have been made. This diagnosis is further supported by the
observation that the individual had acute hepatitis and, most importantly, had detectable HEV
US-1 strain RNA in serum samples. These data indicate that synthetic peptides derived from
the HEV US-1 strain may be useful in more accurately diagnosing acute infection due to HEV.



TABLE 45 





                          IgM: ORF 3 synthetic peptide 4-2    IgM: ORF 2 synthetic peptide 3-2e
Specimens                 ISOLATES                            ISOLATES
Tested                    Burmese     Mexican     US-1        Burmese     Mexican     US-1

Negative Control          0.059       0.081       0.031       0.142       0.065       0.109
Positive Control          0.854       0.985       1.363       1.309       0.579       1.798

USP-1
8 days post admission                             +                                   +
9 days post admission                             +                                   +
10 days post admission                            +                                   +
37 days post admission                            +                                   +



TABLE 46 



                          IgG: ORF 3 synthetic peptide 4-2    IgG: ORF 2 synthetic peptide 3-2e
Specimens                 ISOLATES                            ISOLATES
Tested                    Burmese     Mexican     US-1        Burmese     Mexican     US-1

Negative Control          0.039       0.055       0.031       0.034       0.057       0.041
Positive Control          1.296       0.666       0.941       1.322       0.893       1.041

USP-1                                             +                                   +
8 days post admission                             +                                   +
9 days post admission                             +                                   +
10 days post admission                            +
37 days post admission                            +

Panel 3 - Other cases of potential acute HEV infection

A panel of sera from 50 patients diagnosed with acute hepatitis who were negative for
IgM class antibodies to the Burmese and Mexican strains was assembled. Ten of the 50 serum
samples were positive for antibodies to the US strain of HEV (Tables 47 and 48). RT-PCR was
performed on these samples, but none of the 10 were positive for HEV RNA. Thus, as
demonstrated in this example, when patient sera are analyzed for the presence of antibodies to
HEV US-1, occult viral hepatitis may be diagnosed as acute HEV infection.



TABLE 47 



                          IgM: ORF 3 synthetic peptide 4-2    IgM: ORF 2 synthetic peptide 3-2e
Specimens                 ISOLATES                            ISOLATES
Tested                    Burmese     Mexican     US-1        Burmese     Mexican     US-1

Negative Control          0.059       0.081       0.031       0.142       0.065       0.109
Positive Control          0.854       0.985       1.363       1.309       0.579       1.798

US                                                                                    +
Acute non A-E                                                                         +
SH 755                                                                                +
DT314                                                                                 +
EH 673                                                                                +
SG560                                                                                 +
SR681
N11C10                                            +                                   +
35                                                +                                   +
52                                                                                    +
161                                                                                   +
175

TABLE 48



                          IgG: ORF 3 synthetic peptide 4-2    IgG: ORF 2 synthetic peptide 3-2e
Specimens                 ISOLATES                            ISOLATES
Tested                    Burmese     Mexican     US-1        Burmese     Mexican     US-1

Negative Control          0.039       0.055       0.031       0.034       0.057       0.041
Positive Control          1.296       0.666       0.941       1.322       0.893       1.041

US
Acute non A-E
SH 755
DT314
EH 673
SG560
SR681                                                                                 +
N11C10
35                                                                                    +
52
161
175

Example 9 - Animal Transmission Studies 

Cynomolgus macaques (Macaca fascicularis) were obtained through the Southwest
Foundation for Biomedical Research (SFBR) in San Antonio, Texas. The animals were
maintained and monitored in accordance with guidelines established by SFBR to ensure
humane care and the ethical use of primates. Sera were obtained twice weekly for at least four
weeks prior to inoculation in order to establish the baseline levels for serum ALT. Cut-off
(CO) values were determined based on the mean of the baseline plus 3.75 times the standard
deviation. Two macaques were inoculated intravenously with 0.4-0.625 mL of HEV positive
USP-1 serum and one macaque was inoculated with 2.0 mL of HEV positive USP-2 serum.
Serum and fecal samples were collected twice weekly for up to 16 weeks post-inoculation (PI).
Sera were tested for changes in ALT and values greater than the CO were considered positive
and suggestive of liver damage. Serum samples were tested for antibodies to HEV as described
hereinabove in Example 8 (Table 49, Figure 7). Serum and fecal samples were tested for HEV
RNA by RT-PCR. RNA was extracted from 25-100 µL of macaque serum using the QIAamp Viral
RNA Kit (Qiagen). Ten percent fecal suspensions were extracted as described in Example 1.
RT-PCR was performed as described below in Example 12 (Figure 7).

Although intravenous inoculation of 0.4-0.625 mL of USP-1 sera into two cynomolgus
macaques failed to produce infection (data not shown), inoculation of 2.0 mL of sera from
patient US-2 resulted in viremia and elevations of liver enzyme levels in the serum (Figure 7).
HEV RNA was first detected in fecal material on day 15 PI and remained positive through 64
days PI. Serum specimens collected between days 28-56 PI were HEV RNA positive. Elevated
ALT values were noted on days 15, 44-58, 72 and 93 PI, with the peak ALT value (116 IU/L)
on day 51 PI.
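
The cut-off rule stated above (baseline mean plus 3.75 times the standard deviation) can be illustrated with the ALT values listed in Table 49. This is a sketch under stated assumptions: it treats the eight pre-inoculation bleeds (DPI -82 to -59) as the baseline and uses the sample standard deviation; the text does not say which bleeds or which form of the standard deviation were actually used, and the function name is introduced here for illustration.

```python
from statistics import mean, stdev

# ALT values (IU/L) transcribed from Table 49.
baseline_alt = [35, 39, 38, 36, 45, 43, 37, 42]   # DPI -82 to -59 (assumed baseline)
post_alt = {8: 34, 10: 34, 15: 58, 16: 44, 21: 35, 23: 49, 28: 39, 30: 40,
            35: 41, 37: 48, 42: 45, 44: 58, 51: 116, 56: 87, 58: 76, 64: 45,
            65: 46, 70: 39, 72: 54, 77: 47, 79: 50, 84: 46, 86: 52, 93: 67,
            100: 36, 106: 38, 107: 36}             # keys are days post-inoculation


def alt_cutoff(values, k=3.75):
    """Cut-off = baseline mean + k * sample standard deviation (assumed form)."""
    return mean(values) + k * stdev(values)


cutoff = alt_cutoff(baseline_alt)
elevated_days = [dpi for dpi, alt in post_alt.items() if alt > cutoff]
print(f"cut-off = {cutoff:.1f} IU/L; elevated on DPI {elevated_days}")
```

With these assumptions the cut-off falls a little under 53 IU/L, and the days flagged as elevated are 15, 44, 51, 56, 58, 72 and 93 PI, in line with the days reported in the text.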

Six ELISAs based on the Burmese, Mexican and US sequences for the 4-2 and 3-2e
peptides were utilized to assess antibody response. Measurable response was found only to the
US 3-2e peptide assay (Table 49) with no noted crossreactivity to the Burmese or Mexican
peptides. IgM class antibody directed against HEV was detectable between 28 and 58 days PI.
This was followed by a strong anti-HEV-IgG response at day 44 PI.

TABLE 49



Date        DPI     ALT     AST     GGT     IgG S/N

06/04/97    -82     35      37      102     1.4
06/06/97    -80     39      32      90
06/11/97    -75     38      36      100
06/13/97    -73     36      46      86
06/18/97    -68     45      30      85
06/20/97    -66     43      37      87
06/25/97    -61     37      30      92
06/27/97    -59     42      36      87
08/25/97    0       41      36      107     1
08/27/97    2
09/02/97    8       34      34      102
09/04/97    10      34      31      91
09/09/97    15      58      42      108     0.8
09/10/97    16      44      45      93
09/15/97    21      35      32      86
09/17/97    23      49      71      88
09/22/97    28      39      33      86      1.2
09/24/97    30      40      37      90
09/29/97    35      41      40      80
10/01/97    37      48      58      90      1.1
10/03/97    39
10/06/97    42      45      33      89
10/08/97    44      58      38      94      6.2
10/15/97    51      116     62      89      11.9
10/20/97    56      87      38      83      33.6
10/22/97    58      76      43      85      29.9
10/28/97    64      45      42      88      17.2
10/29/97    65      46      34      88
11/03/97    70      39      54      85
11/05/97    72      54      47      88      13.3
11/10/97    77      47      33      93
11/12/97    79      50      38      93      12.4
11/17/97    84      46      31      91      10.4
11/19/97    86      52      41      88
11/26/97    93      67      104     109     7.2
12/03/97    100     36      36      108
12/09/97    106     38      34      115
12/10/97    107     36      29      103     2.1



Example 10: Recombinant Protein ELISAs 

A. Recombinant Constructs 

E. coli derived recombinant proteins encoded by HEV-US sequence from the ORF 2
and ORF 3 regions of the HEV-US genome were expressed as fusion proteins with CMP-KDO
synthetase (CKS), designated as pJOorf3-29 (SEQ ID NO:191); cksorf2m-2 (SEQ ID NO:192);
and CKSORF32M-3 (SEQ ID NO:193), or as non-fusion proteins, designated as plorf3-12
(SEQ ID NO:194); plorf2-2.6 (SEQ ID NO:195); and PLORF-32M-14-5 (SEQ ID NO:196).
The cloning vector pJO201, as described in U.S. Patent No. 5,124,255, was used in the
construction of the recombinant fusion proteins. This vector was digested with the restriction
endonucleases Eco RI and Bam HI to allow cloning of HEV-US sequences in frame with CKS.
The lambda pL expression vector pKRR826 was utilized in the construction of recombinant
non-fusion proteins. This vector was digested with the restriction endonucleases Eco RI and
Bam HI to allow for cloning of HEV-US sequences immediately downstream of the ribosome
binding site. Since the vector system contains a strong lambda promoter, induction of
heterologous protein synthesis is accomplished by a shift in temperature from 30°C to 42°C,
which inactivates the temperature sensitive repressor protein. The constructs were cloned and
transformed into E. coli K12 strain HS36 cells for the expression of these HEV proteins.

HEV-US sequences were amplified from nucleic acids extracted from HEV US-2 human
serum or macaque 13906 fecal material and reverse transcribed as described above in Example
5. The ORF 2 sequence, encompassing the carboxyl half of ORF 2 (i.e., encoding amino acid
residue numbers 334-660 of SEQ ID NO:167), was generated using a sense primer, SEQ ID
NO:208, which contained an Eco RI restriction site as well as an ATG start codon, and an
antisense primer, SEQ ID NO:198, which contained a unique peptide sequence termed FLAG
(Eastman Kodak), two consecutive TAA termination codons, and a Bam HI restriction site. A
50 µL PCR reaction was set up using LA TAQ (Takara) reagents as recommended by the
manufacturer. Cycling conditions involved 40 cycles of 94°C for 20 seconds, 55°C for 30
seconds, and 72°C for 2 minutes. Amplifications were preceded by 1 minute at 94°C and followed
by 10 minutes at 72°C. Products were digested with Eco RI and Bam HI and ligated into the
desired vector. The nucleotide sequence of the CKS fusion clone, between the restriction sites,
is set forth in SEQ ID NO:192, the translation of which is set forth in SEQ ID NO:199. The
nucleotide sequence of the non-fusion clone, between restriction sites, is set forth in SEQ ID
NO:195, the translation of which is set forth in SEQ ID NO:200. The ORF 3 sequence,
encompassing the entire ORF 3 (amino acids 1-122), was generated using a sense primer, SEQ
ID NO:201, which contained an Eco RI restriction site as well as an ATG start codon, and an
antisense primer, SEQ ID NO:202, which contained a unique peptide sequence termed FLAG,
two consecutive TAA termination codons, and a Bam HI restriction site. A 50 µL PCR
reaction was set up using Qiagen reagents as described in Example 5. Cycling conditions
comprised 35 cycles of 94°C for 30 seconds, 55°C for 30 seconds, and 72°C for 1 minute.
Amplifications were preceded by incubation for 1 minute at 94°C and followed by 10 minutes at
72°C. The resulting products were digested with Eco RI and Bam HI and ligated into the
desired vector. The nucleotide sequence of the CKS fusion clone, between the restriction sites,
is set forth in SEQ ID NO:191, the translation of which is set forth in SEQ ID NO:203. The
nucleotide sequence of the clone representing the non-fusion construct, between the restriction
sites, is set forth in SEQ ID NO:195, the translation of which is set forth in SEQ ID NO:204.
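
For readers who want to compare the two amplification programs described above, the following sketch encodes them as data and sums the programmed hold times. It is illustrative only: the class and field names are assumptions, and ramp times between temperatures (which depend on the thermal cycler) are ignored.

```python
from dataclasses import dataclass


@dataclass
class Step:
    temp_c: int
    seconds: int


@dataclass
class Program:
    initial: Step   # denaturation before cycling
    cycle: list     # steps repeated in each cycle
    cycles: int
    final: Step     # final extension

    def hold_seconds(self):
        """Total programmed hold time, ignoring ramp times between temperatures."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial.seconds + self.cycles * per_cycle + self.final.seconds


# ORF 2 fragment: 40 cycles of 94C/20 s, 55C/30 s, 72C/2 min
orf2 = Program(Step(94, 60), [Step(94, 20), Step(55, 30), Step(72, 120)], 40, Step(72, 600))
# ORF 3 fragment: 35 cycles of 94C/30 s, 55C/30 s, 72C/1 min
orf3 = Program(Step(94, 60), [Step(94, 30), Step(55, 30), Step(72, 60)], 35, Step(72, 600))

for name, prog in (("ORF 2", orf2), ("ORF 3", orf3)):
    print(f"{name}: {prog.hold_seconds() / 60:.0f} min of programmed holds")
```

The longer extension step and higher cycle number for the ORF 2 fragment account for most of the difference between the two programs.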

Additionally, a chimeric construct encompassing the full length ORF 3 (amino acids
1-123) and the carboxyl half of ORF 2 (amino acids 334-660) was generated. Approximately 100
ng of the plasmids containing SEQ ID NO:191 and SEQ ID NO:192 were utilized as template
in 100 µL PCR reactions. PCR buffers and enzymes were from the LA TAQ kit (Takara), and
used in accordance with the manufacturer's instructions. ORF 3 was amplified with primers set
forth in SEQ ID NOS:201 and 205. The antisense primer of SEQ ID NO:205 eliminates the
FLAG sequences and stop codons from the carboxyl end of SEQ ID NO:191 and contains the
sequence identical to SEQ ID NO:192 which will eliminate the ATG start codon. ORF 2 was
amplified with primers of SEQ ID NOS:208 and 198. Cycling conditions were as described
above using LA TAQ. The resulting products were fractionated on a 1.2% agarose gel and
excised. DNA was isolated from the gel slices using GeneClean II as described by the
manufacturer (Bio 101). Products were eluted off the glass beads into 15 µL H2O.
Approximately equal molar ratios of each product (10 µL of ORF 3 product and 1 µL of ORF 2
product) were mixed in a 25 µL end fill reaction using 1x PCR buffer, 0.5 µL dNTPs, and 0.25
µL LA TAQ (Takara). This reaction was cycled as follows: 94°C for 1 minute, 10 cycles of
94°C for 20 seconds, 55°C for 30 seconds, and 72°C for 1.5 minutes, followed by 72°C for 10
minutes. 5 µL of this reaction was placed into a 100 µL amplification reaction utilizing the LA
TAQ kit (Takara) and primers of SEQ ID NOS:201 and 198. Cycling conditions were 94°C for
1 minute followed by 35 cycles of 90°C for 20 seconds, 55°C for 30 seconds, and 72°C for 1.5
minutes. This was followed by 10 minutes at 72°C and a 4°C soak. Products of the appropriate
size were digested with restriction enzymes Eco RI and Bam HI. This product was ligated into
pJO201 and clones with the appropriate sequence identified (SEQ ID NO:193, the translation of
which is set forth in SEQ ID NO:206). The resulting product was ligated into pKRR826 and
clones with the appropriate sequence (SEQ ID NO:196, the translation of which is set forth in
SEQ ID NO:207) identified.

B. Protein expression and purification 

The CKS constructs were expressed in two 500 mL cultures (4 hour induction), as
described in U.S. Patent No. 5,312,737. pL constructs were expressed as described above.
Frozen cell pellets of the induced E. coli cultures were used as the starting material for the
purification of protein. Cells were lysed in buffer containing lysozyme, DNase and proteinase
inhibitors. Soluble protein was separated from insoluble (inclusion body) protein by
centrifugation at 11,000 x g. The solubility of the recombinant protein was estimated via
sodium dodecyl sulfate (SDS) polyacrylamide gel electrophoresis (PAGE) and Western blotting
using a FLAG® M2 antibody.

Soluble recombinant protein was purified by affinity chromatography using FLAG® M2
antibody affinity gel after exchange into suitable buffer (Surowy et al. (1997) Journal of
General Virology, 78:1851-1859). If necessary, additional purification was performed via
Sephacryl® S-200 gel filtration chromatography, in which the sample and chromatography
buffers contained 10 mM β-mercaptoethanol. Purified protein was quantitated by measurement
of absorbance at 280 nm. An assumed extinction coefficient of 1 was used to convert
absorbance to mg of protein. Protein purity was determined by scanning densitometry
(Molecular Dynamics) of protein fractionated by SDS PAGE, using standards of pre-determined
purity.

C. ELISA 

In order to determine potential utility of the recombinant HEV US constructs, solid phase
ELISAs were developed and evaluated. All recombinant HEV US proteins were coated onto
solid phase as described below. Briefly, 1/4" polystyrene beads were coated with varying
amounts of pJOorf3-29 which ranged in concentration from 0.5 to 10 µg/mL diluted in 100
mM sodium phosphate buffer, pH 7.6. Sixty beads per concentration condition were coated in
approximately 14 mL of buffer and rotated end-over-end at 40°C for 2 hours. The coating
solution was aspirated and the remainder of the coating procedure was performed as described
above in Example 8, section E, paragraph 1.

An ELISA was developed using the pJOorf3-29 coated beads. Briefly, sera or plasma
was diluted 1:16 in Specimen Diluent (SpD) as described above. A 10 µL aliquot of this
pre-dilution then was added into the well of a reaction tray, followed by the addition of 200 µL of
SpD. One coated bead was added per well and incubated for 1 hour at 37°C in dynamic mode
using a Dynamic Incubator (Abbott Laboratories). After incubation, the fluid was aspirated and
each bead washed 3 times with deionized water (5 mL per wash). The beads then were
incubated with 200 µL HRPO-labeled goat anti-human IgG or IgM conjugate, diluted in
conjugate diluent (described above), for 30 minutes at 37°C. The conjugate then
was aspirated and the beads washed as above. Color development and absorbance readings
were performed as described in Example 8, section E.

To validate the immunoreactivity of this construct, serial bleed specimens from
Macaque #13903 experimentally infected with HEV US-2 (described in Example 9) were tested
for IgM and IgG antibody to pJOorf3-29. As shown in Figure 1, IgM antibody was detected at
day 51 post-infection (PI) and continued to be elevated through day 72 and corresponded to the
peak elevations in ALT values. IgG antibody to pJOorf3-29 was first detected on day 56 PI and
remained positive through day 107 (Table 50).

A second construct, plorf3-12, representing HEV US ORF 3 but lacking the CKS fusion
partner, was also evaluated in an ELISA format identical to that described above. IgG antibody
to plorf3-12 was evaluated on several serial bleeds from the same experimentally infected
macaque. IgG antibody to plorf3-12 was detected on day 58 PI and remained positive through
day 107 (Table 50).



TABLE 50 





                    pJOorf3-29              plorf3-12
Sample              Mean OD     S/N         Mean OD     S/N

SpD                                         0.01
"pre-bleed"         0.02                    0.01

Post-inoculation bleeds - Days Post-inoculation (DPI)

DPI
44                  0.02        0.96        0.02        1.07
51                  0.05        2.35        0.03        2.25
56                  0.24        10.35       0.05        3.43
58                  0.44        19          0.16        11.57
63                  1.14        49.57       0.32        22.82
65                              NT          0.53        37.54
70                              NT          1.19        85.04
72                  2.22        96.52       0.92        65.71
98                  0.89        38.87       0.39        27.86
107                 0.49        21.43       0.27        19.36

NT: not tested









Due to the high percent homology between swine HEV and the US-2 isolate, the
pJOorf3-29 ELISA also was used to measure the prevalence of both immunoreactive IgG and
IgM in sera isolated from U.S. swine herds (Table 51). The assay was performed as described
above with the exception of substituting HRPO-labeled anti-swine immunoglobulin
(either IgG or IgM) for the anti-human conjugate.

TABLE 51



Prevalence of Antibody to HEV orf3 in U.S. Swine (pJOorf3-29)

Swine           IgG Reactive    No. IgG Confirmed     IgM Only Reactive   No. IgM Only        Total Exposure
Source State    No./Total (%)   by Blocking or Blot   No./Total (%)       Confirmed by Blot   Confirmed Only
                                (%)                                       (%)

New Jersey      9/14 (64)       9 (100)               0/14                                    64%
Texas           25/50 (50)      20 (80)               0/50                                    40%
Iowa            7/64 (11)       1 (14)                0/64                                    2%
Oregon          7/36 (19)       5 (71)                1/36 (3)            1/1 (100)           14%
Total           48/164 (29)     35 (73)               1/164 (0.6)         1/1 (100)           36/164 (22%)

NOTE: A total of 4 pigs (all Texas herd) had IgM in addition to IgG.
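
The "Total" row of Table 51 can be re-derived from the per-state counts as a quick consistency check. The sketch below only re-adds the IgG counts transcribed from the table; the variable names are illustrative.

```python
# Counts transcribed from Table 51: state -> (IgG reactive, tested, IgG confirmed)
herds = {
    "New Jersey": (9, 14, 9),
    "Texas": (25, 50, 20),
    "Iowa": (7, 64, 1),
    "Oregon": (7, 36, 5),
}

reactive = sum(r for r, _, _ in herds.values())
tested = sum(t for _, t, _ in herds.values())
confirmed = sum(c for _, _, c in herds.values())

print(f"IgG reactive: {reactive}/{tested} ({100 * reactive / tested:.0f}%)")
print(f"IgG confirmed: {confirmed} ({100 * confirmed / reactive:.0f}% of reactive)")
```

The sums reproduce the 48/164 (29%) reactive and 35 (73%) confirmed figures given in the Total row.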



In order to confirm reactive specimens, a blocking assay was developed. Briefly, a 10
µL aliquot of the 1:16 specimen pre-dilution was added to duplicate wells of a reaction tray;
one well to be used for the standard assay and one well to be used for the blocking assay. The
ELISA for the standard assay was performed as described above with the exception that there
was a 30 minute room temperature pre-incubation step prior to addition of the pJOorf3-29
antigen coated bead. For the blocking assay, pJOorf3-29 was added to the SpD (blocking
reagent) at a 10-fold molar excess to that on the solid phase. 200 µL of blocking reagent was
added per reaction and a 30 minute room temperature pre-incubation was performed prior to
addition of the pJOorf3-29 antigen coated bead. The rest of the assay was performed as
described above for the swine assay, except that the HRPO-conjugated anti-swine conjugate
(IgG) was used in place of the anti-human conjugate.

The % blocking was determined using the equation:

% blocking = [(A492 nm standard assay - A492 nm blocking assay)/A492 nm standard assay] x 100

Specimens that showed blocking rates of 50% or greater were considered to be reactive for IgG
antibody to HEV pJOorf3-29. Representative IgG positive and IgG negative swine samples
and their blocking results are shown in Table 52.
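
The % blocking equation and the 50% confirmation threshold can be written as a small helper. This is a minimal sketch: the function names and the input check are assumptions added for illustration, and the absorbances in the usage lines are hypothetical, not values from Table 52.

```python
def percent_blocking(a492_standard, a492_blocked):
    """Percent reduction in A492 caused by pre-incubation with free antigen."""
    if a492_standard <= 0:
        raise ValueError("standard-assay absorbance must be positive")
    return (a492_standard - a492_blocked) / a492_standard * 100


def is_confirmed(a492_standard, a492_blocked, threshold=50.0):
    """A specimen is confirmed when blocking is at or above the threshold."""
    return percent_blocking(a492_standard, a492_blocked) >= threshold


# Hypothetical absorbances, for illustration only.
print(percent_blocking(1.00, 0.25), is_confirmed(1.00, 0.25))
print(percent_blocking(0.20, 0.15), is_confirmed(0.20, 0.15))
```

Values recomputed this way from the rounded ODs printed in Table 52 may differ slightly from the tabulated percentages, which were presumably calculated from unrounded readings.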

Table 52 - Blocking Assay With pJOorf3-29 and PL-12 at 10-fold molar excess

                    Standard assay            Blocking assay
No.  SAMPLE         OD            MEAN OD     OD            MEAN OD     % BLOCKING

     NC             0.02, 0.02    0.02        0.02, 0.03    0.02
     PC             1.01, 1.09    1.05        0.48, 0.56    0.52        50.4%

IgG Positives
1    NJ5            0.65                      0.15                      76.5%        +
2    NJ12           1.78                      0.46                      74.0%        +
3    NJ21           0.48                      0.16                      66.7%        +
4    NJ23           0.52                      0.09                      81.9%        +
5    T5             2                         0.81                      59.5%        +
6    T9             0.52                      0.18                      64.3%        +
7    T32            2                         0.9                       54.9%        +
8    T33            0.3                       0.13                      57.8%        +
9    T48            0.53                      0.14                      73.7%        +
10   T49            0.33                      0.09                      73.3%        +

Oregon Swine
IgG Negatives
11   T43            0.08                      0.07                      13.3%
12   T46            0.12                      0.08                      29.1%
13   1-23           0.12                      0.08                      32.2%
14   1-24           0.07                      0.06                      13.2%
15   1-27           0.1                       0.08                      12.6%
16   1-28           0.15                      0.12                      20.4%
17   1-33           0.15                      0.12                      19.9%
18   1-39           0.23                      0.14                      37.4%
19   1-61           0.19                      0.14                      25.9%
20   0-4            0.15                      0.12                      22.7%
In addition to the blocking assay, western blots were run on a subset of swine
specimens. Briefly, 50 µg of HEV pJOorf3-29 and 50 µg of "CKS only" proteins were
fractionated by SDS-PAGE and the fractionated proteins transferred to nitrocellulose. 3 mm
strips of the nitrocellulose were cut and incubated overnight at room temperature on an orbital
rotator with primary antibody at a 1:100 dilution in protein based buffer containing 10% E. coli
lysate. On the following day, strips were washed three times with 0.3% Tween/TBS (TBST),
followed by the addition of HRPO-conjugated anti-swine IgG conjugate diluted to 0.5 µg/mL
in TBST. Strips were incubated with rotation for 4 hours at room temperature. Blots then were
washed three times in TBST, followed by 2 washes in TBS. Blots were developed using
4-chloro-1-naphthol as a substrate. The reaction was stopped by the addition of water and band
intensities recorded. Specimens were determined to have specific reactivity to HEV if they
showed a band at the correct molecular weight for pJOorf3-29 (approx. 40 kD) and had no
reactivity in the region where the "CKS only" protein bands (approx. 29 kD). Results for 20
swine sera run on the pJOorf3-29 western blot are shown in Table 53. No swine sera showed
non-specific reactivity with the "CKS-only" band.

TABLE 53





                      BAND INTENSITY
Swine ID Number       pJOorf3-29      CKS only

NJ4
NJ7                   +
NJ14                  +++
NJ18
NJ25                  ++++
T6                    (illegible)
T10                   (illegible)
T14
T15 (?)               +
T18                   ++
T28 (?)
T29
T30                   +
T34
T36                   ++++
T37
T43
T44                   ++++
T45                   ++++
T46







These data suggest that HEV US recombinant proteins are useful in diagnosing exposure to 
HEV. 

Example 11 - Consensus Primers 

Consensus oligonucleotide primers for HEV ORF 1, ORF 2 and ORF 3 were designed
based on conserved regions between the full length sequences of isolates from Asia, Mexico,
and the US (Figure 9). The ORF 1 primers are positioned within the methyltransferase region
at nucleotides 56-79 and 473-451 of the Burmese isolate (GenBank accession number
M73218), and amplify a product 418 nucleotides in length. The ORF 1 primers include:



HEVConsORF1-s1; CTGGCATYACTACTGCYATTGAGC (SEQ ID NO:147); and

HEVConsORF1-a1; CCATCRARRCAGTAAGTGCGGTC (SEQ ID NO:148).

The ORF 2 primers, at positions 6298-6321 and 6494-6470 of the Burmese isolate,
produce a product 197 nucleotides in length. The ORF 2 primers include:

HEVConsORF2-s1; GACAGAATTRATTTCGTCGGCTGG (SEQ ID NO:150); and

HEVConsORF2-a1; CTTGTTCRTGYTGGTTRTCATAATC (SEQ ID NO:126).
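
The degenerate positions in these consensus primers use standard IUPAC ambiguity codes (R = A/G, Y = C/T, K = G/T, W = A/T). As a hedged illustration, the sketch below counts how many distinct sequences each first-round primer represents and recomputes the product lengths from the stated Burmese-isolate coordinates; the helper names are assumptions, and only the codes that appear in these primers are included.

```python
# IUPAC codes used in the consensus primers (subset of the standard table).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "K": "GT", "W": "AT"}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def amplicon_length(sense_start, antisense_start):
    """Product length from the outermost primer coordinates (inclusive)."""
    return antisense_start - sense_start + 1


primers = {
    "HEVConsORF1-s1": "CTGGCATYACTACTGCYATTGAGC",
    "HEVConsORF1-a1": "CCATCRARRCAGTAAGTGCGGTC",
    "HEVConsORF2-s1": "GACAGAATTRATTTCGTCGGCTGG",
    "HEVConsORF2-a1": "CTTGTTCRTGYTGGTTRTCATAATC",
}
for name, seq in primers.items():
    print(name, len(seq), "nt, degeneracy", degeneracy(seq))

print("ORF 1 product:", amplicon_length(56, 473))
print("ORF 2 product:", amplicon_length(6298, 6494))
```

The computed lengths match the 418 and 197 nucleotide products stated in the text, and the primer lengths match the coordinate ranges given for each primer.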

For a second round of amplification, internal primers can be used to produce products
287 and 145 nucleotides in length for ORF 1 and ORF 2, respectively. The ORF 1 primers
include:

HEVConsORF1-s2; CTGCCYTKGCGAATGCTGTGG (SEQ ID NO:177); and
HEVConsORF1-a2; GGCAGWRTACCARCGCTGAACATC (SEQ ID NO:178).

The ORF 2 primers include:

HEVConsORF2-s2; GTYGTCTCRGCCAATGGCGAGC (SEQ ID NO:152); and
HEVConsORF2-a2; GTTCRTGYTGGTTRTCATAATCCTG (SEQ ID NO:128).

PCR reactions contained 2 mM MgCl2 and 0.5 µM of each oligonucleotide primer as
per the manufacturer's instructions (Perkin-Elmer) and were amplified using Touch-down PCR as
described in Example 5. Amplified products were separated on a 1.5% agarose gel and
analyzed for the presence of PCR products of the appropriate size. The primers were used to
detect the presence of virus in serum and feces containing HEV US-2 as described above in
Example 8 and Figure 7. In addition, these primers were found to be reactive with a number of
different variants of HEV that included Burmese-like strains 6A, 7A, 9A and 12A, two
distinct isolates from Greece (see Example 13 below), a unique isolate from Italy
and the two isolates from the US (see Example 13 below). In addition, these primers have been
used to identify an isolate from a patient with a clinical diagnosis of acute sporadic hepatitis
from the Liaoning province of China (S15). The results are presented in Table 54 below.

TABLE 54



Sample 


ORF 1 -PCR1 


ORF 1 -PCR 2 


ORF 2 - PCR1 


ORF 2 -PCR2 


6A 


neg 


pos 


pos 


Pos 


7A 


neg 


pos 


neg 


Pos 


9A 


neg 


neg 


neg 


Pos 


12A 


pos 


pos 


neg 


Neg 


Gl 


pos 


pos 


pos 


Pos 


G2 


pos 


pos 


pos 


Pos 


Itl 


pos 


pos 


pos 


Pos 


S15 


nd 


pos 


nd 


Pos 


US-2 


pos 


pos 


pos 


Pos 
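
The pattern in Table 54 is easier to compare across assays when tallied. The Python sketch below is an illustration only: it transcribes the table rows as read from the source, treats "nd" as untested (an assumption, since the abbreviation is not defined in this section), and counts positives per assay; the names are hypothetical.

```python
ASSAYS = ("ORF 1 PCR1", "ORF 1 PCR2", "ORF 2 PCR1", "ORF 2 PCR2")

# Rows transcribed from Table 54; column order matches ASSAYS.
TABLE_54 = {
    "6A":   ("neg", "pos", "pos", "pos"),
    "7A":   ("neg", "pos", "neg", "pos"),
    "9A":   ("neg", "neg", "neg", "pos"),
    "12A":  ("pos", "pos", "neg", "neg"),
    "G1":   ("pos", "pos", "pos", "pos"),
    "G2":   ("pos", "pos", "pos", "pos"),
    "It1":  ("pos", "pos", "pos", "pos"),
    "S15":  ("nd",  "pos", "nd",  "pos"),
    "US-2": ("pos", "pos", "pos", "pos"),
}


def positives_per_assay(table):
    """Return {assay: (number positive, number tested)}, skipping 'nd' entries."""
    summary = {}
    for index, assay in enumerate(ASSAYS):
        tested = [row[index] for row in table.values() if row[index] != "nd"]
        summary[assay] = (tested.count("pos"), len(tested))
    return summary


def detected_by_any(table):
    """Samples that scored positive in at least one of the four assays."""
    return [sample for sample, row in table.items() if "pos" in row]


for assay, (positive, tested) in positives_per_assay(TABLE_54).items():
    print(f"{assay}: {positive}/{tested} positive")
print("Detected by at least one assay:", detected_by_any(TABLE_54))
```

Read this way, the nested second-round reactions pick up samples that the first rounds miss, and no single assay detects every sample.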



Example 12 - Detection of HEV RNA in Primary Human Fetal Kidney Cells

Frozen cell pellets containing 10 x 10^6 cells were thawed and resuspended in 1.0 mL
Dulbecco's phosphate buffered saline. RNA was extracted from 20 µL (2 x 10^5 cells) of the cell
pellet using the Ultraspec Isolation System as described in Example 1. cDNA synthesis was
performed on the extracted RNA and primed with random hexamers. PCR then was performed
on the cDNA using degenerate primers from the ORF 1 and ORF 2 regions of the viral genome
at a final concentration of 0.5 µM as described in Example 11.
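
The cell number per extraction follows from the dilution described above. A short Python sketch of the arithmetic, assuming the pellet is evenly resuspended; the function name is illustrative only:

```python
def cells_in_aliquot(total_cells: int, suspension_ul: int, aliquot_ul: int) -> float:
    """Cells contained in an aliquot of an evenly mixed suspension."""
    if suspension_ul <= 0 or not 0 <= aliquot_ul <= suspension_ul:
        raise ValueError("aliquot must be between 0 and the suspension volume")
    return total_cells * aliquot_ul / suspension_ul


# 10 x 10^6 cells resuspended in 1.0 mL (1000 uL); 20 uL taken for RNA extraction.
cells = cells_in_aliquot(10_000_000, 1000, 20)
print(f"{cells:.0f} cells per extraction")
```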

To monitor the performance of the above assay, a positive control utilizing primary
human kidney cells and HEV US-2 positive serum was included in the experimental design.
Two positive control sets were prepared by spiking 2 x 10^5 HEV negative primary human kidney
cells with 2.5 µL and 25 µL of a documented HEV US-2 positive serum specimen. The
positive control serum also was tested without the addition of the human kidney cells.

Nineteen primary human kidney cell pellet lots were tested using the above assay
method utilizing the 2 degenerate primer sets from ORF 1 and ORF 2. The results are
summarized in Table 55 below. None of the cell pellet lots tested gave positive results as seen
in the positive controls.

TABLE 55

1946
cells + 25 µL serum       +
cells + 2.5 µL serum      +
25 µL serum               +


Example 13: Identification and Extension of Additional US-type Isolates

A. Identification of isolate from Italy, referred to as It1

RNA was extracted from 25 to 50 µL of serum using the QIAamp Viral RNA kit
(Qiagen) as described by the manufacturer except that 25 to 50 µL of serum was diluted to
100 µL with PBS and the final elution was performed with 100 µL of RNase-free water. RT
reactions were random primed. PCR utilized the HEV US-1 primers as described hereinabove in
Example 5. A 294 bp product was generated after amplification with primers SEQ ID NO:94
and SEQ ID NO:96. The product was cloned and sequenced as described in Example 3 and is
shown in SEQ ID NO:179.

Extension of the It1 isolate genome was performed as follows. RNA was extracted from
25 to 50 µL of serum as described hereinabove in Example 5. RT reactions were random
primed. PCR utilized the HEV consensus primers described above in Example 11 using
touchdown PCR, as described hereinabove in Example 3. Primers shown in SEQ ID NOS:147
and 148 were used to generate a product having the sequence set forth in SEQ ID NO:180
(reaction z2, 418 bp). Primers as shown in SEQ ID NOS:150 and 126 were used to generate a
product having the sequence set forth in SEQ ID NO:181 (reaction z3, 197 bp). In the presence
of 1x PCR Buffer and 20% Q Solution (Qiagen), primers as shown in SEQ ID NOS:182 and
183 were used to generate a product having a sequence set forth in SEQ ID NO:184 (reaction
z4, 234 bp). The 3' end of the genome was isolated by 3' RACE as described above in
Example 3 using primers shown in SEQ ID NOS:150 and 85 in PCR1, and primers shown in
SEQ ID NOS:152 and 85 in PCR2, to produce a product having the sequence shown in SEQ ID
NO:185 (reaction z5, 890 bp). Products were cloned and sequenced as described in Example 3
and consensus sequences generated. These regions are shown in Figure 8 and are set forth in
SEQ ID NOS:180, 184 and 186. The amino acid translations of these regions are represented
by the amino acid sequences set forth in SEQ ID NOS:187, 188, 189, 190 and 197.

B. Identification of two isolates from Greece, referred to as G1 and G2

Two patients with acute hepatitis who had no history of travel to endemic areas had
been analyzed with primers based on the Burmese isolate (Psichogiou M.A., et al. (1995)
"Hepatitis E virus (HEV) infection in a cohort of patients with acute non-A, non-B hepatitis,"
Journal of Hepatology, 23, 668-673). Only patient G2 was found to be PCR positive. RNA
was isolated as described hereinabove in Example 12 and PCR performed with the consensus
primers described above in Example 11. The ORF 1 and ORF 2 primer sets generated products
of the expected size from both patients. The products were cloned and sequenced as described
above in Example 3. The products generated using the ORF 1 and ORF 2 consensus primers
from patient G1 are shown in SEQ ID NOS:209 and 211, respectively. The products generated
using the ORF 1 and ORF 2 consensus primers from patient G2 are shown in SEQ ID NOS:213
and 215, respectively. The identification of G1 as being PCR positive demonstrates the utility
of the consensus primers over Burmese-based strain-specific primers.

Additional sequence from G1 and G2 was also obtained using primers SEQ ID NO:16,
SEQ ID NO:17, and SEQ ID NO:18, as for the generation of SEQ ID NO:19 described above
in Example 3, except that random primed cDNA was used for PCR and amplification involved
10 cycles of 94°C for 20 seconds, 60°C for 30 seconds, and 72°C for 1 minute, followed by 10
cycles of 94°C for 20 seconds, 55°C for 30 seconds, and 72°C for 1 minute, followed by 30
cycles of 94°C for 20 seconds, 50°C for 30 seconds (-0.3°C/cycle), and 72°C for 1 minute.
This was followed by an extension cycle of 72°C for 7 minutes. The product generated from
patient G1 is shown in SEQ ID NO:217. The product generated from patient G2 is shown in
SEQ ID NO:220.
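
The cycling program described for G1 and G2 can be written out as a per-cycle annealing schedule. The Python sketch below is one reading of the text, assuming that the "-0.3°C/cycle" decrement applies only to the last 30 cycles and that the first of those cycles anneals at 50°C; the function name is hypothetical.

```python
def annealing_schedule():
    """Annealing temperature (degrees C) for each of the 50 amplification cycles."""
    temperatures = [60.0] * 10            # first block: 10 cycles at 60
    temperatures += [55.0] * 10           # second block: 10 cycles at 55
    # Third block: 30 cycles starting at 50, dropping 0.3 per cycle (assumed reading).
    temperatures += [round(50.0 - 0.3 * cycle, 1) for cycle in range(30)]
    return temperatures


schedule = annealing_schedule()
print(len(schedule), "cycles")
print("first:", schedule[0], "last:", schedule[-1])
```

Under this reading the annealing step of the final cycle is 8.7°C below the 50°C starting point of the last block.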

Alignments of the nucleotide sequences of the US, Chinese, Greek, Italian, Mexican and
Burmese-like isolates were performed to determine the relationship of these isolates to each
other. The divergence of the Italian isolate is supported by the comparisons of the product from
the ORF 1 region of the genome, which has a percent nucleic acid identity of 77.6%, 78.4%,
and 84.6% with the prototype isolates from Burma (B1), Mexico (M1) and the US (US-1),
respectively (Table 36). The divergence of the Italian isolate also is supported by the
comparisons of the product from the ORF 2 region of the genome, which had a percent nucleic
acid identity of 83.3%, 79.7%, and 87.8% with the prototype isolates from Burma, Mexico
and the US, respectively (Table 37). The nucleotide identities between the prototype isolates
from Burma, Mexico and the US range from 75.5% to 82.4% over these two regions.
Over these same regions, the isolates that comprise the Burmese-like group have much higher
identities of 91.2% or greater.
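
The percent identities quoted in this Example are pairwise comparisons over aligned fragments. The Python sketch below shows one common way to compute such a value; it is an illustration rather than the procedure used in the specification, and its handling of gaps (columns with a gap in either sequence are skipped) is an assumption. The toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of compared alignment columns in which two aligned sequences agree.

    Columns where either sequence has a gap ('-') are not counted (assumed convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 10-nucleotide aligned fragments differing at one position.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```

On short fragments a single mismatch shifts the percentage noticeably, which matters when comparing values obtained from products of different lengths.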

Comparisons of the ORF 1 and ORF 2 amplified sequences indicate that the isolates from
the two patients from Greece are quite distinct from each other, exhibiting 84.4% and 87.2%
nucleotide sequence identity over these regions of ORF 1 and ORF 2, respectively. At the
nucleotide level, the percent identities between the Greek, Italian and US isolates range from
81.9% to 86.8% for the ORF 1 product (Table 36) and 82.4% to 87.8% for the ORF 2 product
(Table 37). These values are lower than the lowest percent nucleotide identities between any
Burmese-like isolates, which are greater than 91.2% for both ORF 1 and ORF 2. Comparisons
of the amino acid identities derived from the ORF 1 fragment between the US, Italian or Greek
isolates and the Burmese or Mexican isolates range from 87.8% to 93.5% (Table 36). These
values are equal to or less than the identities between the Burmese and Mexican isolates
(93.5% to 95.1%) (Table 36), indicating that the isolates from non-endemic regions are distinct
from the isolates originating from endemic regions.

The relative evolutionary distances between the viral sequences analyzed are readily
apparent upon inspection of the unrooted phylogenetic trees generated from the pairwise
distances, where the branch lengths are proportional to the relative genetic relationships
between the isolates. The phylogenetic trees based on alignments of either ORF 1 (Fig. 10) or
ORF 2 (Fig. 11) sequences are quite similar in overall topology. The Burmese-like isolates and
the Mexican isolate represent major branches at one end of the tree. The human US isolates
form a distinct group distal to the Mexican and Burmese isolates. The swine HEV-like
sequence from ORF 2 is closely related to the US human isolates. The three European isolates
form three additional distinct branches with the Italian isolate being most closely related to the
US isolates.

Example 14: Identification of Additional US-type Isolates from Austria and Argentina

RNA was isolated from serum from three patients with acute hepatitis who had no
history of travel to areas considered endemic for HEV as described hereinabove in Example 12
and PCR performed with the consensus primers described above in Example 11. One patient
was from Austria, Au1 (Worm, et al. (1998) "Sporadic hepatitis E in Austria," New England
Journal of Medicine, 339, 1554-1555), while the other two patients were from Argentina. The
ORF 1 and ORF 2 primer sets generated products of the expected size from all patients. The
products were cloned and sequenced as described above in Example 3. The products generated
using the ORF 1 and ORF 2 consensus primers from patient Au1 are shown in SEQ ID
NOS:243 and 245, respectively. The products generated using the ORF 1 and ORF 2
consensus primers from patient Ar1 are shown in SEQ ID NOS:247 and 249, respectively. The
products generated using the ORF 1 and ORF 2 consensus primers from patient Ar2 are shown
in SEQ ID NOS:251 and 253, respectively. PCR products were obtained after both the first
round of ORF 1 PCR with the a1 and s1 primers as well as the second round of nested ORF 1
PCR with the a2 and s2 primers for Au1, Ar1 and Ar2. PCR products were obtained after both
the first round of ORF 2 PCR with the a1 and s1 primers as well as the second round of nested
ORF 2 PCR with the a2 and s2 primers for Au1 and Ar2. Product from Ar1 was detected only
after the second round of nested ORF 2 PCR with the a2 and s2 primers.

Alignments of the nucleotide sequences of the US, Chinese, Greek, Italian, Austrian,
Argentine, Mexican and Burmese-like isolates were performed as described in Example 6 to
determine the relationship of these isolates to each other. The divergence of the Austrian
isolate, Au1, is supported by the comparisons of the product from the ORF 1 region of the
genome, which has a percent nucleic acid identity of 77.1%, 78.2%, and 87.9% with the
prototype isolates from Burma (B1), Mexico (M1) and the US (US-1), respectively (Table 56).
The divergence of the Austrian isolate also is supported by the comparisons of the product
from the ORF 2 region of the genome, which had a percent nucleic acid identity of 85.1%,
79.1%, and 83.1% with the prototype isolates from Burma (B1), Mexico (M1) and the US
(US-1), respectively (Table 57). The divergence of the Argentine isolate, Ar2, is supported by
the comparisons of the product from the ORF 1 region of the genome, which has a percent
nucleic acid identity of 76.0%, 76.0%, and 84.9% with the prototype isolates from Burma (B1),
Mexico (M1) and the US (US-1), respectively (Table 56). The divergence of the Ar2 isolate
also is supported by the comparisons of the product from the ORF 2 region of the genome,
which had a percent nucleic acid identity of 85.8%, 82.4%, and 85.8% with the prototype
isolates from Burma (B1), Mexico (M1) and the US (US-1), respectively (Table 57). The
divergence of the Argentine isolate, Ar1, is supported by the comparisons of the product from
the ORF 1 region of the genome, which has a percent nucleic acid identity of 76.6%, 77.6%,
and 85.7% with the prototype isolates from Burma (B1), Mexico (M1) and the US (US-1),
respectively (Table 56). The nucleotide identities between the prototype isolates from Burma
(B1), Mexico (M1) and the US (US-1) range from 75.5% to 82.4% over these two regions.
Over these same regions, the isolates that comprise the Burmese-like group have much higher
identities of 91.2% or greater.

Although only a nested ORF 2 PCR product was obtained from the Argentine isolate, Ar1, the
divergence of the Ar1 isolate also is supported by the comparisons of this smaller product from
the ORF 2 region of the genome, which had a percent nucleic acid identity of 80.6% with the
prototype isolates from Burma (B1), Mexico (M1) and the US (US-1) (Table 57).

At the nucleotide level, the percent identities between the Austrian, Argentine, Greek,
Italian and US isolates (excluding the identity between US-1 and US-2) range from 80.6% to
89.8% for the ORF 1 product (Table 56). At the nucleotide level, the percent identities between
the Austrian, Argentine, Greek, Italian and US isolates (excluding the identity between US-1
and US-2 and Ar-1 and Ar-2) range from 80.6% to 89.2% for the ORF 2 product (Table 57).
These values are lower than the lowest percent nucleotide identities between any Burmese-like
isolates, which are 91.2% or greater for ORF 1 and ORF 2.



[Rotated table page not legible in the source scan]



o 
o 



80.6 


82.4 


79.1 


81.1 


79.1 


79.7 


79.7 


81.8 


83.8 


82.4 


80.4 


81.8 


82.4 


82.4 


81.8 


83.8 


83.8 


83.1 


1 M1 1 


80.6 


86.5 


82.4 


83.8 


84.5 


82.4 


77.7 


81.1 


1 83.8 


93.9 


93.2 


96.6 


98.6 


98.6 


97.3 


95.9 


91.2 




| 95.9 


79.6 


85.1 


83.1 


82.4 


83.1 


81.8 


78.4 


81.8 


83.1 


97.3 


95.3 


91.9 


92.6 


92.6 | 


91.9 


ON 

CO 
ON 


£ 


98.0 


| 93.9 


82.7 


85.1 


84.5 


83.1 


oq 

CO 

oo 


83.8 


77.0 


| 80.4 1 


83.1 


96.6 


1 95 - 9 1 


96.6 


| 97.3 | 


1 97.3 | 


96.6 




| 98.0 


o 
o 


| 95.9 


80.0 


84.5 


83.1 


82.4 


83.8 | 


82.4 


1 7 6.4 | 


| 79.3 | 


82.4 


| 94.6 


| 93,9 


| 96.6 


| 98.6 


| 98.6 


u 


o 
o 


| 98.0 


oot 


| 95.9 


80.6 


85.1 


83.8 


83.1 


83.8 | 


83.1 


i 77.0 | 


| 80.4 | 


83.1 


| 95.3 


| 94.6 


| 98.0 


o 
o 


| C3 


O 

o 


o 
o 


| 98.0 


o 
o 


ON 

uo 

ON 


80.6 


85.1 


83.8 1 


83.1 | 


83.8 1 


83.1 


77.0 | 


80.4 | 


83.1 


95.3 | 


94.6 | 


[ 98.0 ! 


: ZD 1 


o 
o 


o 
o 


o 
o 


oo 

ON 


o 
o 


1 95.9 


80.6 


83.8 


83.1 | 


82.4 | 


82.4 | 


82.4 


76.4 1 


| 79.7 | 


82.4 


| 94.6 | 


| 93.9 | 


U 


o 
o 


o 
o 


o 
o 


o 
o 


| 98,0 


o 
o 


95.9 


80.6 


85.1 


83.8 | 


82.4 | 


84.5 | 


83.1 


i 78.4 1 


P80.4 | 


84.5 


oo 
On 


1 B2 


| 98.0 


| 98.0 


| 98.0 


98.0 


| 98.0 


| 95.9 


98.0 


93.9 


80.6 


85.8 


85.1 | 


84.5 | 


85.1 | 


83.8 


79.0 | 


82.4 | 


83.8 


1 B1 1 


1 98.0 J 


o 
o 


o 
o 


o 
o 


o 
o 


o 
o 


| 98.0 


o 
o 


95.9 


87.8 I 


90.5 


87.8 | 


85.1 | 


87.8 1 


85.8 


90.5 1 


91.2 J 


55 


i 959 1 


1 93.9 | 


1 95 ' 9 


1 95 - 9 


1 919 J 


95.9 


1 95.9 


I 93.9 


95.9 


95.9 


82.7 1 


85.1 


85.8 | 


to 
oo 


85.1 1 


85.8 


93.9 1 


US-2 1 


o 
o 


95.9 | 


93.9 1 


95.9 1 


1 959 i 


1 959 1 


95.9 


I 9 5.9 ! 


1 93.9 


95.9 


95.9 


80.6 1 


85.8 


83.1 ] 


84.5 1 


82.4 1 


87,8 


1 US-1 | 


o 
o 


o 
o 


1 95.9 1 


I 93.9 1 


1 95 - 9 1 


1 95 ' 9 1 


1 959 1 


95.9 


1 95 - 9 


1 93.9 


ON 

in 

ON 


95.9 


83.7 1 


87,2 


a* 

CO 


oq 
oo 


83.1 


+■» 


98.0 


1 98.0 1 


98.0 


1 98.0 


1 95,9 


98.0 


1 98.0 


98.0 


0*86 


98.0 


95.9 


98.0 


98.0 


82.7 1 


86.5 


88.5 1 


87.2 


ZD 


o 
o 


98.0 


98.0 


98.0 


98.0 1 


95.9 


98.0 


98.0 | 


98.0 


98.0 


98.0 


95.9 


98.0 


98.0 


81.6 1 


83.8 


83.1 


3 


o 
o 


o 
o 


98.0 


98.0 


98.0 


98.0 


95.9 


98.0 


98.0 


98.0 


98.0 


98,0 


95.9 


98.0 


98.0 


87.8 1 


88.5 


Aul 


o 
o 


o 
o 


o 
o 


98.0 


i 98.0 


98.0 


98.0 


On 
uo 

ON 


98.0 


98.0 


98.0 


98.0 


98.0 


95.9 


98.0 


98.0 


91.8 1 


Ar2 


o 
o 


o 
o 


o 
o 


o 
o 


98.0 


j 98.0 


98.0 


98.0 


95.9 


98.0 


98.0 


98.0 ! 


98.0 


98.0 


95.9 


98.0 


98.0 


1 Arl 1 


o 
o 


o 
o 


o 
o 


o 
o 


o 
o 


96.9 


96.9 


96.9 


6*96 


96.9 


96.9 


96.9 


96.9 


96.9 ' 


96.9 


93.8 


6*96 


ON 

vd 

ON 




Comparisons of the ORF 1 and ORF 2 amplified sequences indicate that the isolates from
the two patients from Argentina are quite distinct from each other, exhibiting 88.4% and 91.8%
nucleotide sequence identity over these regions of ORF 1 and ORF 2, respectively. The
value for ORF 1 is lower than the lowest percent nucleotide identity between any Burmese-like
isolates, which is 91.4% for ORF 1. For ORF 2, however, the nucleotide identity of 91.8%
between the two isolates from Argentina is in the range observed for identities between the
Burmese-like isolates for ORF 2, which may be due to the shorter length of the fragment.

Phylogenetic analyses were performed as described in Example 7. The relative 
evolutionary distances between the viral sequences analyzed are readily apparent upon 
inspection of the unrooted phylogenetic trees generated from the pairwise distances, where the 
branch lengths are proportional to the relative genetic relationships between the isolates. The 
phylogenetic trees based on alignments of either 371 nucleotides from ORF 1 (Fig. 14), 148 
nucleotides from ORF 2 (Fig. 15) which excludes Arl, or 98 nucleotides from ORF 2 (Fig. 16), 
which includes Arl, are quite similar in overall topology. The Burmese-like isolates and the 
Mexican isolate represent major branches at one end of the tree. The human US isolates form a 
distinct group distal to the Mexican and Burmese isolates. The swine HEV-like sequence is 
closely related to the US human isolates. The four European isolates and two Argentine 
isolates also form branches distal to the Mexican and Burmese isolates. The major branch 
between the US-type isolates, represented by the US, Greek, Italian, Austrian and Argentine 
isolates, and the Burmese-like and Mexican isolates is supported by a bootstrap value of 75.7%
or greater in all trees.
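
Bootstrap support of the kind cited above is conventionally obtained by resampling alignment columns with replacement, rebuilding the tree for each replicate, and counting how often a given branch recurs. The Python sketch below illustrates only the resampling step under that general convention; it is not the software used in Example 7, and the toy alignment and function name are hypothetical.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Return one bootstrap replicate: columns drawn with replacement, same width."""
    if not alignment:
        raise ValueError("alignment is empty")
    width = len(alignment[0])
    if any(len(sequence) != width for sequence in alignment):
        raise ValueError("all sequences must have the same aligned length")
    # Draw column indices with replacement; every sequence uses the same draw.
    columns = [rng.randrange(width) for _ in range(width)]
    return ["".join(sequence[c] for c in columns) for sequence in alignment]


# Hypothetical three-sequence alignment, 10 columns wide.
toy_alignment = ["ACGTACGTAC", "ACGTACGAAC", "ACTTACGTTC"]
replicate = bootstrap_replicate(toy_alignment, random.Random(0))
for row in replicate:
    print(row)
```

Because whole columns are resampled, each replicate keeps the sequences aligned with one another while varying which sites contribute to the distance estimates.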

Example 15: New Degenerate Primers 

Degenerate primers derived from consensus oligonucleotide primers for HEV ORF 1
and ORF 2 were designed based on conserved regions between the full length sequences of
isolates from Asia, Mexico and the US as described in Example 11, as well as isolates from
Greece and Italy. The ORF 1 primer is positioned within the methyltransferase region at
nucleotides 473-451 of the Burmese isolate (GenBank accession number M73218), and
amplifies a product 417 nucleotides in length when used in combination with
HEVConsORF 1-s1 (SEQ ID NO:147), as described in Example 11. The new ORF 1 primer
combination includes:

HEVConsORF 1-s1; CTGGCATYACTACTGCYATTGAGC (SEQ ID NO:147); and
HEVConsORF 1N-a1; CCRTCRARRCARTAGGTGCGGTC (SEQ ID NO:255).



The new ORF 2 primer, at positions 6494-6470 of the Burmese isolate, produces a
product 197 nucleotides in length when used in combination with HEVConsORF 2-s1 (SEQ
ID NO:150), as described in Example 11. The ORF 2 primers include:

HEVConsORF 2-s1; GACAGAATTRATTTCGTCGGCTGG (SEQ ID NO:150); and
HEVConsORF 2N-a1; CYTGYTCRTGYTGGTTRTCATAATC (SEQ ID NO:256).



For a second round of amplification, internal primers can be used to produce products
287 and 145 nucleotides in length for ORF 1 and ORF 2, respectively, as described in Example
11. The new combination of ORF 1 primers includes:

HEVConsORF 1N-s2; CYGCCYTKGCGAATGCTGTGG (SEQ ID NO:257); and
HEVConsORF 1-a2; GGCAGWRTACCARCGCTGAACATC (SEQ ID NO:178).

The ORF 2 primers include:

HEVConsORF 2-s2; GTYGTCTCRGCCAATGGCGAGC (SEQ ID NO:152); and
HEVConsORF 2N-a2; GYTCRTGYTGRTTRTCATAATCCTG (SEQ ID NO:258).
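
The new primers differ from the Example 11 primers at only a few positions. The Python sketch below, offered as an illustration, lists the positions at which two equal-length primers differ and checks whether the new IUPAC code at each such position still includes the bases allowed by the old one. The helper names are hypothetical and the sequences are copied from the text as transcribed.

```python
# IUPAC codes that occur in the primers of Examples 11 and 15.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "W": "AT",
}


def changed_positions(old: str, new: str):
    """List (1-based position, old code, new code) wherever two primers differ."""
    if len(old) != len(new):
        raise ValueError("primers must have the same length")
    return [(i + 1, o, n) for i, (o, n) in enumerate(zip(old, new)) if o != n]


def broadened(old_code: str, new_code: str) -> bool:
    """True if the new code allows every base that the old code allowed."""
    return set(IUPAC[new_code]) >= set(IUPAC[old_code])


pairs = {
    "ORF 2 a1": ("CTTGTTCRTGYTGGTTRTCATAATC", "CYTGYTCRTGYTGGTTRTCATAATC"),
    "ORF 2 a2": ("GTTCRTGYTGGTTRTCATAATCCTG", "GYTCRTGYTGRTTRTCATAATCCTG"),
    "ORF 1 s2": ("CTGCCYTKGCGAATGCTGTGG", "CYGCCYTKGCGAATGCTGTGG"),
}
for name, (old, new) in pairs.items():
    for position, old_code, new_code in changed_positions(old, new):
        status = "broadened" if broadened(old_code, new_code) else "replaced"
        print(f"{name}: position {position} {old_code} -> {new_code} ({status})")
```

For the three pairs shown, each change swaps a fixed base for a two-base ambiguity code that still contains it, which is consistent with the stated aim of matching additional isolates.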

PCR reactions contained 2 mM MgCl2 and 0.5 µM of each oligonucleotide primer as
per the manufacturer's instructions (Perkin-Elmer) and were amplified using touch-down PCR
as described in Example 5. Amplified products were separated on a 1.5% agarose gel, stained
with ethidium bromide, and analyzed for the presence of PCR products of the appropriate size.
The primers were used to detect the presence of virus in serum containing HEV as described
above and showed a marked increase in sensitivity over the previous primer sets used in
Example 11. These new primer combinations were found to be more sensitive with a number of
different variants of HEV that included two new isolates from Argentina, Ar1 and Ar2, and a
new isolate from Austria, Au1 (see Example 14 above), as well as isolates from Greece, G1,
and Egypt, Eg46. The results are presented in Table 58 below in which NT represents samples
not tested, represents no product band detectable by ethidium bromide staining, "+/-" represents
a weak product band detectable by ethidium bromide staining, and "2+", "3+" and "4+"
represent increasing amounts of product as detected by ethidium bromide staining.

Equivalents

The invention may be embodied in other specific forms without departing from the spirit
or essential characteristics thereof. The foregoing embodiments are therefore to be considered in
all respects illustrative rather than limiting on the invention described herein. Scope of the
invention is thus indicated by the appended claims rather than by the foregoing description, and
all changes that come within the meaning and range of equivalency of the claims are intended to
be embraced therein.

WHAT IS CLAIMED IS:

1. A method of detecting the presence of a US-type or US-subtype hepatitis E virus (HEV)
or a naturally occurring variant thereof in a test sample, the method comprising the steps of:

(a) contacting the sample with a binding partner that binds specifically to a marker
for said virus, which if present in the sample binds to the binding partner to produce a
marker-binding partner complex, and

(b) detecting the presence of said complex, the presence of said complex being
indicative of the presence of said virus in the sample.

2. The method of claim 1, wherein said marker is an antibody capable of binding said
virus.

3. The method of claim 2, wherein said antibody is an immunoglobulin G or an
immunoglobulin M.

4. The method of claim 2, wherein said binding partner is an isolated polypeptide chain.

5. The method of claim 4, wherein said polypeptide chain is immobilized on a solid
support.

6. The method of claim 4, wherein said binding partner is a polypeptide chain selected
from the group consisting of SEQ ID NOS:91, 92, and 93, including naturally occurring
variants thereof.

7. The method of claim 4, wherein said binding partner is a polypeptide chain comprising
the amino acid sequence set forth in SEQ ID NO:173 or SEQ ID NO:175.

8. The method of claim 4, wherein said binding partner is a polypeptide chain comprising
the amino acid sequence set forth in SEQ ID NO:174 or SEQ ID NO:176.

9. The method of claim 4, wherein said binding partner is a polypeptide chain selected
from the group consisting of SEQ ID NOS:166, 167 and 168, including naturally occurring
variants thereof.

10. The method of claim 4, wherein said binding partner is a polypeptide comprising the
amino acid sequence set forth in SEQ ID NO:223.

11. The method of claim 4, wherein said binding partner is a polypeptide comprising the
amino acid sequence set forth in SEQ ID NO:224.

12. The method of claim 1, wherein said binding partner is an isolated antibody capable of
binding specifically to a polypeptide chain selected from the group consisting of SEQ ID
NOS:91, 92, 93, 166, 167, and 168, including naturally occurring variants thereof.

13. The method of claim 12, wherein said antibody is a monoclonal antibody.

14. The method of claim 1, wherein said marker is a polypeptide chain.

15. The method of claim 14, wherein said polypeptide chain is selected from the group
consisting of SEQ ID NOS:91, 92, and 93, including naturally occurring variants thereof.

16. The method of claim 14, wherein said polypeptide chain comprises the amino acid
sequence set forth in SEQ ID NO:173 or SEQ ID NO:175.

17. The method of claim 14, wherein said polypeptide chain comprises the amino acid
sequence set forth in SEQ ID NO:174 or SEQ ID NO:176.

18. The method of claim 14, wherein said polypeptide chain is selected from the group
consisting of SEQ ID NOS:166, 167, and 168, including naturally occurring variants thereof.

19. The method of claim 14, wherein said polypeptide chain comprises the amino acid
sequence set forth in SEQ ID NO:223.

20. The method of claim 14, wherein said polypeptide chain comprises the amino acid
sequence set forth in SEQ ID NO:224.

21. The method of claim 1, wherein said marker is a nucleic acid sequence defining at least
a portion of a genome of said virus, or a complementary strand thereof.

22. The method of claim 1 wherein said binding partner is an isolated nucleic acid sequence
that is capable of hybridizing under specific hybridization conditions to the nucleic acid
sequences set forth in SEQ ID NOS:89 and 164.

23. The method of claim 1 wherein said binding partner is selected from the group
consisting of SEQ ID NOS:126, 128, 147, 148, 150, 152, 177, 178, 255, 256, 257, and 258.

24. The method of claim 1 wherein said binding partner is an isolated polypeptide chain.

25. The method of claim 1 wherein said test sample is a mammalian cell line.

26. The method of claim 41 wherein said mammalian cell line is a human fetal kidney cell
line.

27. A method of detecting the presence of a hepatitis E virus (HEV) in a test sample, the
method comprising the steps of:

(a) contacting the sample with a binding partner selected from the group consisting
of SEQ ID NOS:126, 128, 147, 148, 150, 152, 177, 178, 255, 256, 257, and 258 that binds
specifically to a marker for said virus, which if present in the sample binds to the binding
partner to produce a marker-binding partner complex, and

(b) detecting the presence of said complex, the presence of said complex being
indicative of the presence of said virus in the sample.

28. An isolated polypeptide chain comprising the amino acid sequence set forth in SEQ ID
NO:173, SEQ ID NO:174, SEQ ID NO:175, SEQ ID NO:176, SEQ ID NO:223 and SEQ ID
NO:224.

29. An isolated antibody capable of binding specifically to a polypeptide chain selected
from the group consisting of a polypeptide encoded by an ORF 1 sequence of a US-type or a
US-subtype HEV, a polypeptide encoded by an ORF 2 sequence of a US-type or a US-subtype
HEV, and a polypeptide encoded by an ORF 3 sequence of a US-type or a US-subtype HEV.

30. An isolated antibody capable of binding specifically to a polypeptide chain comprising
the amino acid sequence set forth in SEQ ID NO:173, SEQ ID NO:175 or SEQ ID NO:224.

31. An isolated antibody capable of binding specifically to a polypeptide chain comprising
the amino acid sequence set forth in SEQ ID NO:174, SEQ ID NO:176 or SEQ ID NO:223.

32. The isolated antibody of claim 30, wherein said antibody, under similar conditions, has
a lower affinity for a polypeptide chain comprising the amino acid sequence set forth in SEQ
ID NO:169 or 171.

33. The isolated antibody of claim 31, wherein said antibody, under similar conditions, has
a lower affinity for a polypeptide chain comprising the amino acid sequence set forth in SEQ ID
NO:170 or 172.

34. The isolated antibody of claim 29 further comprising a detectable moiety. 

35. An isolated nucleic acid sequence defining at least a portion of an ORF 1, ORF 2 or 
ORF 3 sequence of a US-type or US-subtype hepatitis E virus, or a sequence complementary 
thereto. 

36. An isolated nucleic acid sequence capable of hybridizing under specific hybridization 
conditions to the nucleotide sequence set forth in SEQ ID NOS:89 and 164. 

37. A vector comprising the isolated nucleic acid sequence of claim 35. 

38. A host cell containing the vector of claim 37. 

39. A method of immunizing a mammal against a US-type or US-subtype HEV, the method 
comprising administering to the mammal the polypeptide of claim 28 in an amount sufficient to 
stimulate the production of an antibody capable of binding specifically to the US-type or US- 
subtype hepatitis E virus. 

40. A method of immunizing a mammal against a US-type or US-subtype HEV, the
method comprising administering to said mammal the antibody of claim 29 in an amount
sufficient to immunize said mammal against the US-type or US-subtype hepatitis E virus.

41. A method of immunizing a mammal against a US-type or US-subtype HEV, the
method comprising administering to said mammal the antibody of claim 30 in an amount
sufficient to immunize said mammal against the US-type or US-subtype hepatitis E virus.

42. A method of immunizing a mammal against a US-type or US-subtype HEV, the
method comprising administering to said mammal the antibody of claim 31 in an amount
sufficient to immunize said mammal against the US-type or US-subtype hepatitis E virus.

43. A method of immunizing a mammal against a US-type or US-subtype HEV, the method
comprising administering to said mammal the host cell of claim 38 in an amount sufficient to
immunize said mammal against the US-type or US-subtype hepatitis E virus.

ABSTRACT OF THE DISCLOSURE

Disclosed herein are methods and compositions for detecting the presence in a sample 
of a US-type or a US-subtype hepatitis E virus, including naturally occurring variants thereof. 
In particular, the invention provides nucleic acid sequences corresponding to the genome of the 
US-type or US-subtype hepatitis E virus, amino acid sequences, including epitope sequences, 
encoded by the genomes of such viruses, and antibodies that bind specifically to such amino 
acid sequences. The invention further provides methods and compositions for immunizing
individuals against infection by, or for treating individuals already infected with, such a virus.

SEQUENCE LISTING

<110> Schlauder, George G 
Erker, James C 
Desai, Suresh M 
Dawson, George J 
Mushawar, Isa K 

<120> METHODS AND COMPOSITIONS FOR DETECTING HEPATITIS E 
VIRUS 

<130> 6232.US.01

<140> 
<141> 

<150> US 09/173,141
<151> 1998-10-15 

<160> 258 

<170> PatentIn Ver. 2.0

<210> 1 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer C375M 

<400> 1 

ctgaacatcc cggccgac 



<210> 2 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer A1-350M 
<400> 2 

agaaagcagc gatggagga 



<210> 3 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer S1-34M 



<400> 3 

gcccaccagt tcattaaggc t 

<210> 4
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer A2-320M 
<400> 4 

tcattaatgg agcgtgggtg 

<210> 5 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer S2-55M 
<400> 5 

cctggcatca ctactgctat 



<210> 6 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer C375 
<400> 6 

ctgaacatca cgcccaac 



<210> 7 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer A1-350
<400> 7 

aggaagcagc ggtggacca 

<210> 8 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer S1-34
<400> 8

gcccatcagt ttattaaggc 20



<210> 9
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer A2-320
<400> 9

tcatttattg agcggggatg 20



<210> 10
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer S2-55
<400> 10

cctggcatca ctactgctat 20



<210> 11
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer M1PR6
<400> 11

ccatgttcca caccgtattc cagag 25



<210> 12
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer S4294M
<400> 12

gtgttctacg gggatgctta tgacg 25



<210> 13
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer M1PF6
<400> 13

gactcagtat tctctgctgc cgtgg 25



<210> 14 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer A4556 
<400> 14 

ggctcaccag aatgcttctt ccaga 25 



<210> 15 
<211> 342 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Clone: USP-15 
<400> 15 

gcccatcagt ttattaaggc tcctggcatt actactgcca ttgagcaggc tgctctggct 60
gcggccaatt ctgccttggc gaatgctgtg gtggttcggc cgtttttatc tcgcgtgcaa 120
accgagattc ttattaattt gatgcaaccc cggcagttgg ttttccgccc tgaggtactt 180
tggaatcacc ctatccagcg ggttatacat aatgaattag aacagtactg ccgggctcgg 240
gctggtcgtt gcttggaggt tggagctcac ccaagatcca ttaatgacaa ccccaacgtt 300
ctgcatcggt gtttccttag accggtcggg cgtgatgttc ag 342



<210> 16 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
PA2-5560 

<400> 16 

taggttatac tgccggcgca 20



<210> 17 
<211> 20 
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer S1-5287
<400> 17 

ttctcagccc ttcgcaatcc 



<210> 18 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer S2-5310 
<400> 18 

atattcatcc aaccaacccc 



<210> 19 
<211> 251 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Clone b421 
<400> 19 

atattcatcc aaccaacccc ttcgccgccg atgtcgtttc acaacccggg gctggaactc 60
gccctcgaca gccgccccgc cccctcggtt ccgcttggcg tgaccagtcc cagcgcccct 120
ccgttgcccc ccgtcgtcga tctaccccag ctggggctgc gccgctaact gccatatcac 180
cagcccctga tacagctcct gtacctgatg ttgactcacg tggtgctatt ttgcgccggc 240
agtataacct a 251



<210> 20 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer
US4.2-69S/20

<400> 20 

ttccgcttgg cgtgaccagt 



<210> 21 
<211> 21 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer
US4.4/144s

<400> 21 

gctaactgcc atatcaccag c 21 



<210> 22 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer M6417a 
<400> 22 

cccttatcct gctgagcatt 20



<210> 23 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer M6371a 
<400> 23 

ttggctcgcc attggctgag acaa 24 



<210> 24 
<211> 899 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Clone df-orf2/3 
<400> 24 

gctaactgcc atatcaccag cccctgatac agctcctgta cctgatgttg actcacgtgg 60
tgctattttg cgccggcagt acaatttgtc tacgtccccg cttacatcat ctgttgcttc 120
tggtactaat ctggttctct atgctgcccc gctgaaccct ctcttgcctc ttcaggatgg 180
caccaacact catattatgg ctactgaggc atctaattac gcccagtatc gggttgttcg 240
ggctacgatt cgttatcgcc cgttggtgcc aaatgctgtt ggtggttatg ctatctctat 300
ttctttctgg cctcaaacta caactacccc tacttctgtt gacatgaatt ctatcacttc 360
tactgatgtc aggatcttgg tccagcccgg tatagcctcc gagttagtca tccctagtga 420
acgccttcac taccgcaacc aaggctggcg ctctgttgag accacgggtg [illegible] 480
ggaggctacc tccggtctgg taatgctttg tattcatggc tcccctgtta [illegible] 540
taatacacct tacaccggtg cattggggct [illegible] [illegible] [illegible] 600
aaatttgaca cccgggaaca ctaacacccg tgtttcccgg tatactagca cagcccgcca 660
ccggctgcgc cgcggtgctg atgggaccgc tgagctcacc accacagcag ccacacgctt 720
catgaaggat ttgcatttta ctggtacgaa cggcgttggt gaggtgggtc gtggtattgc 780
cctgactctg tttaatcttg ctgatacgct tcttggtggt ttaccgacag aattgatttc 840
gtcggctggg ggtcaactgt tttactcccg ccctgttgtc tcagccaatg gcgagccaa 899



<210> 25 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer USP 
3s/20 

<400> 25 

tggcattact actgccattg 20



<210> 26 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer M902A 
<400> 26 

atcgatcgga catagacctc 20



<210> 27 
<211> 846 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Clone df-orf1
<400> 27

tggcattact actgccattg agcaggctgc tctggctgcg gccaattctg ccttggcgaa 60
tgctgtggtg gttcggccgt ttttatctcg cgtgcaaacc gagattctta ttaatttgat 120
gcaaccccgg cagttggttt tccgccctga ggtactttgg aatcacccta tccagcgggt 180
tatacataat gaattagaac agtactgccg ggctcgggct ggtcgttgct tggaggttgg 240
agctcaccca agatccatta atgacaaccc caacgttctg catcggtgtt tccttagacc 300
ggttggccga gatgttcagc gctggtactc tgcccccacc cgcggccctg cggctaattg 360
ccgccgctcc gcgttgcgtg gtctcccccc cgctgaccgc acttactgct ttgatggatt 420
ctcccgttgt gcttttgctg cagagaccgg tgtggctctt tactctctgc atgacctttg 480
gccagctgat gttgcagagg ctatggcccg ccacgggatr acacgcttgt atgccgcact 540
gcaccttccc cctgaggtgc tgctaccacc cggcacctac cacacaacct cgtatctcct 600
gattcacgac ggcgaccgcg ctgttgtaac ttacgagggc gatactagtg cgggctataa 660
tcatgatgtc tccatacttc gtgcgtggat ccgtactaca aaaatagttg gtgatcatcc 720
gttggtcata gagcgtgtgc gggccattgg atgtcatttt gtgttgctgc tcaccgcagc 780
ccctgagccg tcacccatgc cttatgttcc ttaccctcgt tcaacggagg tctatgtccg 840
atcgat 846



<210> 28 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 3750s 
<400> 28 

cttccatcag ttggctgagg agc 23



<210> 29 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 3900a 
<400> 29 

gccatgcggc agtgcacaat gtc 23



<210> 30 
<211> 168 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Clone HEV 167 
<400> 30

cttccatcag ttggctgagg agctgggcca tcgcccggcc cctgtcgccg ccgtcttgcc 60
cccttgccct gagcttgagc agggcctgct ctacatgcca caggagctca ctgtgtccga 120
tagtgtgttg gtttttgagc ttacggacat tgtgcactgc cgcatggc 168



<210> 31 
<211> 25
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 5000s 
<400> 31 

ctcgttcata acctgattgg catgc 25



<210> 32 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
uf-orf2/3 a3 

<400> 32 

ggactggtca cgccaagcgg aac 23 



<210> 33 
<211> 424 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Clone HEV 426 
<400> 33 

ctcgttcata acctgattgg catgctgcag accatcgccg atggcaaggc ccactttaca 60
gagactatta aacctgtact tgatctcaca aattccatca tacagcgggt ggaatgaata 120
acatgtcttt tgcatcgccc atgggatcac catgcgccct agggctgttc tgttgttgtt 180
cctcatgttt ctgcctatgc tgcccgcgcc accggccggt cagccgtctg gccgtcgccg 240
tgggcggcgc agcggcggtg ccggcggtgg tttctggagt gacagggttg attctcagcc 300
cttcgccctc ccctatattc atccaaccaa ccccttcgcc gccgatgtcg tttcacaacc 360
cggggctgga actcgccctc gacagccgcc ccgccccctc ggttccgctt ggcgtgacca 420
gtcc 424



<210> 34 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 167-s1
<400> 34 

tctacatgcc acaggagctc actg 



<210> 35 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 426-a3 
<400> 35 

gatggaattt gtgagatcaa gtacagg 



<210> 36 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 167-S2 
<400> 36 

ctcactgtgt ccgatagtgt gttgg 



<210> 37 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 426-a4 
<400> 37 

ccttgccatc ggcgatggtc tgc 



<210> 38 
<211> 1186 
<212> DNA 

<213> Hepatitis E virus 



<220>

<223> Clone HEV 1186
<400> 38

ctcactgtgt ccgatagtgt gttggttttt gagcttacgg atatagttca ttgccgcatg 60
gccgctccaa gccagcgaaa ggctgttctc tcaacacttg tggggaggta tggccgtagg 120
acgaaactat atgaggcggc gcattcagat gttcgtgagt ccctagctag gttcatccct 180
actatcgggc ctgttcaggc taccacatgt gagttgtatg agttggttga ggctatggtg 240
gagaaaggtc aggacggctc tgcagtctta gagcttgatc tttgtaatcg tgatgtctcg 300
cgcatcacat ttttccaaaa agwctgcaac aagtttacaa ctggtgagac catcgcccac 360
ggcaaggttg gccagggtat atcggcctgg agtaagacct tctgcgctct gttcggcccg 420
tggttccgcg ccattgaaaa agaaatattg gccctgctcc cgcctaatat cttttatggc 480
gacgcttatg aggagtcagt ttttgccgcc gctgtgtccg gggcggggtc atgtatggta 540
tttgaaaatg acttttcaga gtttgacagt acccagaata atttctctct tggccttgag 600
tgtgtggtta tggaggagtg cggcatgcct caatggctaa ttaggttgta ccatctggtt 660
cggtctgcct ggattctgca ggcgccgaag gagtctctta agggtttctg gaagaagcat 720
tctggtgagc ctggtaccct tctttggaat accgtctgga atatggcgat tatagcacat 780
tgctatgagt tccgtgactt tcgtgttgct gcctttaagg gtgatgattc ggtggtcctc 840
tgtagtgact accgacagag ccgcaatgca gctgccttaa ttgctggctg tgggctcaaa 900
ttgaaggttg attaccgccc tatcgggctg tatgctgggg tggtggtggc ccccggtttg 960
gggacactgc ccgatgtggt gcgttttgct ggtcggttgt ctgaaaagaa ttggggcccc 1020
ggcccggaac gtgctgagca gctgcgtctt gctgtctgcg acttccttcg agggttgacg 1080
aatgttgcgc aggtctgtgt tgatgttgtg tcccgtgtct atggagtcag ccccgggctc 1140
gtacataacc ttattggcat gctgcagacc atcgccgatg gcaagg 1186



<210> 39 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer orf1-s2
<400> 39 

tcacccatgc cttatgttcc ttacc 25 
<210> 40 
<211> 22
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 1300a 
<400> 40 

ggcggcctgg gatgtaatca cg 22



<210> 41 
<211> 460 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Clone HEV 459 



<400> 41

tcacccatgc cttatgttcc ttaccctcgt tcaacggagg tgtatgtccg gtccatattt 60
ggccctggcg gctccccatc cttgtttccg tcagcctgct ctactaaatc tactttccat 120
gctgtcccgg tgcatatctg ggatcggctc atgctctttg gtgccaccct ggacgatcag 180
gcgttttgct gttcacggct catgacttac ctccgtggta ttagttacaa ggtcactgtc 240
ggcgcgcttg tcgctaatga ggggtggaac gcctctgaag acgctcttac tgcartgatc 300
actgcagctt atttgactat ttgccatcag cgttatctcc gcacccaggc gatatccaag 360
ggcatgcgcc ggttgggggt tgagcacgcc cagaaattta tcacaagact ctacagttgg 420
ctatttgaga agtctggccg tgattacatc ccaggccgcc 460



<210> 42 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 459-S2 
<400> 42 

cagaaattta tcacaagact ctacag 26



<210> 43 
<211> 23 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Primer 1450a 
<400> 43

aacactcctg accgagccac ttc 23



<210> 44 
<211> 235 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Clone HEV 216 
<400> 44 

cagaaattta tcacaagact ctacagttgg ctatttgaga agtctggccg tgattatatc 60 
cccggccgcc agcttcagtt ctatgcacag tgccgacggt ggctatctgc aggcttccac 120 
ctagacccca gggtacttgt ttttgatgag tcagtaccat gccgctgtag gacgtttttg 180 
aagaaagttg cgggtaaatt ctgctgtttt atgaagtggc tcggtcagga gtgtt 235



<210> 45 
<211> 26 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl gap-s1
<400> 45

tatagatata acaggttcac ccagcg 26



<210> 46
<211> 24
<212> DNA

<213> Hepatitis E virus
<220>

<223> usl gap-a0.5
<400> 46

gctgcaagac cctcacgcat gatg 24



<210> 47
<211> 23
<212> DNA

<213> Hepatitis E virus
<220>

<223> usl gap-s2
<400> 47

cggattatgg ttacaccctg agg 23



<210> 48
<211> 25 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl gap-a1
<400> 48 

attcagttgg gtaaaacgct tctgg 25 



<210> 49 
<211> 545 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl-gap 
<400> 49 

cggattatgg ttacaccctg aggggttgct gggtattttc ccccctttct cccctgggca 60
tatctgggag tctgcgaacc ccttttgcgg ggaggggact ttgtataccc gaacttggtc 120
aacatctggc ttttctagtg atttctcccc ccctgaagcg gccgctcctg ctatggctgc 180
taccccgggg ctgccccatt ctaccccacc tgttagcgat atttgggtgc taccaccgcc 240
ctcagaggag tttcaggttg atgcagcacc tgtgccccct gcccctgacc ctgctggatt 300
gcccggtccc gttgtgctta cccccccccc ccctccccct gtgcataagc catcaatacc 360
cccgccttcc cgtaaccgtc gtctcctcta tacctatcct gacggcgcta aggtgtatgc 420
agggtcactg tttgaatcag actgtgactg gctggttaat gcctcaaacc cgggccatcg 480
tcccggaggt ggcctctgcc atgcctttta ccaacgtttt ccagaagcgt tttacccaac 540
tgaat 545



<210> 50 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl-2600s 
<400> 50 

gtgctcacca taactgagga cacg 24 



<210> 51 
<211> 24 
<212> DNA

<213> Hepatitis E virus 
<220> 

<223> usl-2600a 
<400> 51 

cgctgcatat gtaacagcaa cagg 



<210> 52 
<211> 344 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl-344 
<400> 52 

gtgctcacca taactgagga cacggcccgt acggccaacc tggcattgga gattgatgcc 60
gctacagagg tcggccgtgc ttgtgccggt tgcaccatca gccctggcat tgtgcactat 120
cagtttaccg ccggggtccc gggctcgggc aagtcaaggt ccatacaaca gggagatgtc 180
gatgtggtgg ttgtgcccac ccgggagctt cgtaatagtt ggcgccgccg gggttttgcg 240
gccttcacac cccacacagc ggcccgtgtt actatcggcc gccgcgttgt gattgatgag 300
gctccatctc tcccgccaca cctgttgctg ttacatatgc agcg 344



<210> 53 
<211> 23 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl 3200s 
<400> 53 

gccgatgtgt gcgagctcat acg 



<210> 54 
<211> 25 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl 3200a 
<400> 54 

atgattgtgg tctctgtgaa ggtgg 



<210> 55 
<211> 194
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl-194 
<400> 55 

gccgatgtgt gcgagctcat acgcggagcc taccctaaaa tccagaccac gagccgtgtg 60
ctacggtccc tgttttggaa tgaaccggcc attggccaga agttggttyt cacgcaggcg 120
gcaaaggctg ctaaccctgg tgcgattacg gtccacgaag ctcagggtgc caccttcaca 180
gagaccacaa tcat 194



<210> 56 
<211> 23 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEV216-S1 
<400> 56 

cagtaccatg ccgctgtagg acg 23 



<210> 57 
<211> 26 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-733a1
<400> 57

ccattagatg aaatctttac ctgcag 26



<210> 58 
<211> 26 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEV216-S2 
<400> 58 

gtaggacgtt tttgaagaaa gttgcg 26 



<210> 59 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220>

<223> us2-733a2 
<400> 59 

ggtgagctca taagtgaggc tgtg 



<210> 60 
<211> 464 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl-733wb

<400> 60

gtaggacgtt tttgaagaaa gttgcgggta aattctgctg ttttatgcgg tggctcgggc 60
aggagtgtac ctgcttcttg gagccggccg agggtrtagt cggcgatcat ggccatgaca 120
acgaggccta tgagggttct gaggtcgacc cggctgaacc tgcacatctt gatgtttctg 180
ggacttacgc cgtccacggg caccagcttg aggccctcta tagggcactt aatgtcccac 240
aagatattgc cgctcgagct tcccgactaa cggcaactgt tgagctcgtt gcaagtccag 300
accgcttaga gtgccgcacc gtgctcggta ataagacctt ccggacgacg gtggtcgacg 360
gcgcccatct agaggcgaat ggccctgagc agtatgtctt atcatttgac gcctcccgtc 420
agtctatggg ggccgggtcg cacagcctca cttatgagct cacc 464



<210> 61 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl 733s1
<400> 61 

ttgagctcgt tgcaagtcca gacc 



<210> 62 
<211> 22 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2851-r2 



<400> 62

ccagaggttg accaggttcg gg 22



<210> 63
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl 733s2 
<400> 63 

ccgtgctcgg taataagacc ttcc 24 



<210> 64 
<211> 433 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl-432 



<400> 64

ccgtgctcgg taataagacc ttccggacga cggtggtcga cggcgcccat ctagaggcga 60
atggccctga gcagtatgtc ttatcatttg acgcctcccg tcagtctatg ggggccgggt 120
cgcatagcct cacttatgag ctcacccctg ctggtttgca ggttaggatt tcatctaatg 180
gtctggattg cactgctaca ttcccccccg gtggagcccc tagcgctgcg cccggggagg 240
tggcagcctt ttgcagtgcc ctttatagat ataacaggtt cacccagcgg cactcgctga 300
ctggcggatt atggttacac cctgaggggt tgctgggtat tttcccccct ttctcccctg 360
ggcatatctg ggagtctgcg aacccctttt gcggggaggg gactttgtat acccgaacct 420
ggtcaacctc tgg 433



<210> 65 
<211> 26 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2851-f1
<400> 65 

gactgtgatt ggttagtcaa tgcctc 26 



<210> 66 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 



<220> 
<223> usl 430-a1
<400> 66 

cgtgtcctca gttatggtga gcac 24 



<210> 67 
<211> 26 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl 430-a2 
<400> 67 

tattagcctc aaaccaattt gcagcg 26



<210> 68 
<211> 382 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl-382 



<400> 68

gactgtgatt ggttagtcaa tgcctcaaac ccgggccatc gtcccggagg tggcctctgc 60
catgcctttt accaacgttt tccagaagcg ttttacccaa ctgaattcat catgcgtgag 120
ggtcttgcag catacacctt gaccccgcgc cctatcattc atgcagtcgc tcccgattat 180
agggttgagc agaacccgaa gaggcttgag gcagcgtacc gtgaaacttg ttcccgtcgt 240
ggcaccgctg cctacccgct tttgggttcg ggtatatacc aggtccctgt tagcctcagt 300
tttgatgcct gggaacgtaa tcaccgcccc ggcgatgagc tttacttgac cgagcccgct 360
gcaaattggt ttgaggctaa ta 382



<210> 69 
<211> 22 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-579-S1 
<400> 69 

cagaccacga gccgtgtgct ac 22 



<210> 70 
<211> 25 
<212> DNA 
<213> Hepatitis E virus
<220>

<223> JE hev167-a1
<400> 70

ccaacacact atcggacaca gtgag 25



<210> 71 
<211> 22 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-579-S2 
<400> 71 

gctgctaagg ctgccaaccc tg 22 



<210> 72 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> JE hev167-a2
<400> 72 

cagtgagctc ctgtggcatg taga 24 



<210> 73 
<211> 451 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl-579wb 



<400> 73

gctgctaagg ctgccaaccc tggtgcgatt acggtccacg aagctcaggg tgccaccttc 60
acagagacca caatcatagc cacggccgac gccaggggcc ttatccagtc atcccgggct 120
catgctatag ttgcacttac tcgccacact gagaagtgtg ttatcctgga tgcccccggc 180
ctgcttcgtg aggtcggcat ttcggatgtg attgtcaaca actttttcct tgctggtggc 240
gaggtcggcc rccaccgccc ttctgtgata cctcgcggta accctgatca aaacctcggg 300
actttacagg ccttcccgcc gtcctgtcaa attagtgctt accatcagtt ggctgaggaa 360
ctgggccatc gcccggcccc tgtcgccgcc gtcttgcccc cttgccctga gcttgagcag 420
ggcctgctct acatgccaca ggagctcact g 451



<210> 74
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-430S1 
<400> 74 

ggtatatacc aggtccctgt tagc 24 



<210> 75 
<211> 22 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-482-a1

<400> 75

ccgctgtgtg aggtgtgaag gc 22



<210> 76 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 



<220> 

<223> us2-430s2 



<400> 76

gttagcctca gttttgatgc ctgg 24



<210> 77 
<211> 23 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-482-a2 
<400> 77 

gacgccagct gttacggagc tcc 23



<210> 78 
<211> 334 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl-430wb
<400> 78

gttagcctca gttttgatgc ctgggaacgt aatcaccgcc ccggcgatga gctttacttg 60
accgagcccg ctgcaaattg gtttgaggct aataagccgg cgcagccggt gctcaccata 120
actgaggaca cggcccgtac ggccaacctg gcattggaga ttgatgccgc tacagaggtc 180
ggccgtgctt gtgccggttg caccatcagc cctggcattg tgcactatca gtttaccgcc 240
ggggtcccgg gctcgggcaa gtcaaggtcc atacaacagg gagatgtcga tgtggtggtt 300
gtgcccaccc gggagctccg taacagctgg cgtc 334



<210> 79 
<211> 23 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-482-S1
<400> 79

gatgtcgatg tggtggttgt gcc 23



<210> 80
<211> 23
<212> DNA

<213> Hepatitis E virus
<220>

<223> JE us2-579-a1
<400> 80

gtaatcgcac cagggttggc agc 23



<210> 81
<211> 23
<212> DNA

<213> Hepatitis E virus
<220>

<223> US2-482-S2
<400> 81

ggagctccgt aacagctggc gtc 23



<210> 82 
<211> 22 
<212> DNA 



<213> Hepatitis E virus 



<220> 
<223> JE us2-579-a2
<400> 82 

cagggttggc agccttagca gc 22 



<210> 83 
<211> 413 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> usl-482wb 



<400> 83

ggagctccgt aacagctggc gtcgccgggg ttttgcggcc ttcacacccc acacagcggc 60
ccgtgttact atcggccgcc gcgttgtgat tgatgaggct ccatctctcc cgccacacct 120
gttgctgtta catatgcagc gggcctcctc ggtccatctc ctcggtgacc caaatcagat 180
ccctgctatt gattttgagc acgccggcct ggtccctgcg atccgtcccg agcttgcgcc 240
aacgagctgg tggcrcgtta cacaccgttg cccggccgat gtgtgcgagc tcatacgcgg 300
agcctaccct aaaatccaga ccacgagccg tgtgctacgg tccctgtttt ggaatgaacc 360
ggccattggc cagaagttgg ttytcacgca ggctgctaag gctgccaacc ctg 413



<210> 84 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: oligo dT 
adapter primer 

<400> 84 

ggccacgcgt cgactagtac tttttttttt ttttttt 37



<210> 85 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: AUAP primer 
<400> 85 

ggccacgcgt cgactagtac 20



<210> 86
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer
df-orf3-s1

<400> 86 

gcgttggtga ggtgggtcgt gg 22 

<210> 87 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer
df-orf3-s2

<400> 87

cgcttcttgg tggtttaccg acag 24



<210> 88
<211> 960
<212> DNA

<213> Hepatitis E virus
<220>

<223> Clone HEV 3p RACE

<400> 88 

cgcttcttgg tggtttaccg acagaattga tttcgtcggc tgggggtcaa ctgttttact 60
cccgccctgt tgtctcggcc aatggcgagc caacagtaaa gttatacaca tctgttgaga 120
atgcgcagca agacaagggc atcaccattc cacacgacat agatttaggt gactcccgtg 180
tggttatcca ggattatgat aaccagcacg aacaagatcg acctaccccg tcacctgccc 240
cctcccgccc tttctcagtt cttcgtgcca atgatgtttt gtggctctct ctcactgccg 300
ctgagtacgr ccagaccacg tatgggtcgt ccaccaaccc tatgtatgtc tctgatacag 360
tcacgcttgt taatgtagcc actggtgctc aggctgttgc ccgctctctt gactggtcta 420
aagttactct ggatggtcgc cctcttacta ccattcagca gtattctaag aaattttatg 480
ttctcccgct tcgsgggaag ctgtcctttt gggaggctgg tacgaccaag gccggctacc 540
cgtataatta taataccact gctagtgacc aaattttgat tgagaacgcg gccggtcacc 600
gtgtcgccat ttctacttat accactagtt tgggtgccgg ccctacctcg atytctgcgg 660
tcggtgtact agctccacat tcggcccttg ctgttctcga ggatactgtt gattatcctg 720
tarttttaat [illegible] [illegible] [illegible] [illegible] [illegible] 780
gtgcattcca atctactatt gctgaacttc agcgtcttaa aatgaaggta ggtaaaaccc 840
gggagtctta attaattcct tttgtgcccc cttcgcagtt ctctttggct ttatttctca 900
tttctgcttt ccgcgctncc ctggaaaaaa aaaaaaaaaa gtactagtcg acgcgtggcc 960


<210> 89 
<211> 7202 
<212> DNA 

<213> Hepatitis E virus 










<220> 

<223> uslfull 












<400> 89

cctggcatta ctactgccat tgagcaggct [illegible] [illegible] [illegible] 60
aatgctgtgg tggttcggcc gtttttatct [illegible] [illegible] tattaattta 120
atgcaacccc ggcagttggt tttccgccct [illegible] [illegible] [illegible] 180
gttatacata atgaattaga acagtactgc [illegible] [illegible] [illegible] 240
ggagctcacc caagatccat taatgacaac [illegible] [illegible] [illegible] 300
ccggttggcc gagatgttca gcgctggtac [illegible] [illegible] [illegible] 360
tgccgccgct ccgcgttgcg tggtctcccc [illegible] [illegible] [illegible] 420
ttctcccgtt gtgcttttgc tgcagagacc [illegible] [illegible] [illegible] 480
tggccagctg atgttgcaga ggctatggcc cyccacgyga [illegible] [illegible] 540
ctgcaccttc cccctgaggt gctgctacca [illegible] [illegible] [illegible] 600
ctgattcacg acggcgaccg cgctgttgta [illegible] [illegible] [illegible] 660
aatcatgatg tctccatact tcgtgcgtgg [illegible] [illegible] [illegible] 720
ccgttggtca tagagcgtgt gcgggccatt [illegible] [illegible] [illegible] 780
gcccctgagc cgtcacccat gccttatgtt [illegible] [illegible] [illegible] 840
cggtccatat ttggccctgg cggctcccca [illegible] [illegible] [illegible] 900
tctactttcc atgctgtccc ggtgcatatc [illegible] tcatgctctt tggtgccacc 960
ctggacgatc aggcgttttg ctgttcacgg ctcatgactt acctccgtgg tattagttac 1020
aaggtcactg tcggcgcgct tgtcgctaat gaggggtgga acgcctctga agacgctctt 1080
actgcartga tcactgcagc ttatttgact atttgccatc agcgttatct ccgcacccag 1140
gcgatatcca agggcatgcg ccggttgggg gttgagcacg cccagaaatt tatcacaaga 1200
[illegible from position 1201]
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cccgattata 


gggttgagca 


gaacccgaag 


aggcttgagg 


cagcgtaccg 


tgaaacttgt 


2640 


tcccgtcgtg 


gcaccgctgc 


ctacccgctt 


ttgggttcgg 


gtatatacca 


ggtccctgtt 


2700 


agectcagtt 


ttgatgcctg 


ggaacgtaat 


caccgccccg 


gegatgaget 


ttacttgacc 


2760 


gagcccgctg 


caaattggtt 


tgaggctaat 


aagccggcgc 


ageeggtget 


caccataact 


2820 


gaggacaegg 


cccgtacggc 


caacctggca 


ttggagattg 


atgccgctac 


agaggtegge 


2880 
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cgtgcttgtg 


ccggttgcac 


catcagccct 


ggcattgtgc 


actatcagtt 


taccgccggg 


2940 




gtcccgggct 


cgggcaagtc 


aaggtccata 


caacagggag 


atgtcgatgt 


ggtggttgtg 


3000 




cccacccggg 


agcttcgtaa 


tagttggcgc 


cgccggggtt 


ttgcggcctt 


cacaccccac 


3060 




acagcggccc 


gtgttactat 


cggccgccgc 


gttgtgattg 


atgaggctcc 


atctctcccg 


3120 




ccacacctgt 


tgctgttaca 


tatgcagcgg 


gcctcctcgg 


tccatctcct 


cggtgaccca 


3180 




aatcagatcc 


ctgctattga 


ttttgagcac 


gccggcctgg 


tccctgcgat 


ccgtcccgag 


3240 




cttgcgccaa 


cgagctggtg 


gcrcgttaca 


caccgttgcc 


cggccgatgt 


gtgcgagctc 


3300 




atacgcggag 


cctaccctaa 


aatccagacc 


acgagccgtg 


tgctacggtc 


cctgttttgg 


3360 




aatgaaccgg 


ccattggcca 


gaagttggtt 


ytcacgcagg 


cggcaaaggc 


tgctaaccct 


3420 




ggtgcgatta 


cggtccacga 


agctcagggt 


gccaccttca 


cagagaccac 


aatcatagcc 


3480 




acggccgacg 


ccaggggcct 


tatccagtca 


tcccgggctc 


atgctatagt 


tgcacttact 


3540 




cgccacactg 


agaagtgtgt 


tatcctggat 


gcccccggcc 


tgcttcgtga 


ggtcggcatt 


3600 




tcggatgtga 


ttgtcaacaa 


ctttttcctt 


gctggtggcg 


aggtcggccr 


ccaccgccct 


3660 


tctgtgatac 


ctcgcggtaa 


ccctgatcaa 


aacctcggga 


ctttacaggc 


cttcccgccg 


3720 




tcctgtcaaa 


ttagtgctta 


ccatcagttg 


gctgaggaac 


tgggccatcg 


cccggcccct 


3780 




gtcgccgccg 


tcttgccccc 


ttgccctgag 


cttgagcagg 


gcctgctcta 


catgccacag 


3840 




gagctcactg 


tgtccgatag 


tgtgttggtt 


tttgagctta 


cggatatagt 


tcattgccgc 


3900 




atggccgctc 


caagccagcg 


aaaggctgtt 


ctctcaacac 


ttgtggggag 


gtatggccgt 


3960 




aggacgaaac 


tatatgaggc 


ggcgcattca 


gatgttcgtg 


agtccctagc 


taggttcatc 


4020 




cctactatcg 


ggcctgttca 


ggctaccaca 


tgtgagttgt 


atgagttggt 


tgaggctatg 


4080 




gtggagaaag 


gtcaggacgg 


ctctgcagtc 


ttagagcttg 


atctttgtaa 


tcgtgatgtc 


4140 




tcgcgcatca 


catttttcca 


aaaagwctgc 


aacaagttta 


caactggtga 


gaccatcgcc 


4200 




cacggcaagg 


ttggccaggg 


tatatcggcc 


tggagtaaga 


ccttctgcgc 


tctgttcggc 


4260 




ccgtggttcc 


gcgccattga 


aaaagaaata 


ttggccctgc 


tcccgcctaa 


tatcttttat 


4320 




ggcgacgctt 


atgaggagtc 


agtttttgcc 


gccgctgtgt 


ccggggcggg 


gtcatgtatg 


4380 




gtatttgaaa 


atgacttttc 


agagtttgac 


agtacccaga 


ataatttctc 


tcttggcctt 


4440 




gagtgtgtgg 


ttatggagga 


gtgcggcatg 


cctcaatggc 


taattaggtt 


gtaccatctg 


4500 




gttcggtctg 


cctggattct 


gcaggcgccg 


aaggagtctc 


ttaagggttt 


ctggaagaag 


4560 




cattctggtg 


agcctggtac 


ccttctttgg 


aataccgtct 


ggaatatggc 


gattatagca 


4620 
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agcggcggtg 


ccggcggtgg 


cttcgccctc 


ccctatattc 


atccaaccaa 


cggggctgga 


actcgccctc 


gacagccgcc 


gtccaagcgc 


ccctccgttg 


ccccccgtcg 


aactgccata 


tcaccagccc 


ctgatacagc 


tattttgcgc 


cggcagtaca 


atttgtctac 


tactaatctg 


gttctctatg 


ctgccccgct 


caacactcat 


attatggcta 


ctgaggcatc 


tacgattcgt 


tatcgcccgt 


tggtgccaaa 
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caaac t acaa 


ctacccctac 


tgatgtcagg 
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cgcaaccaag 


gctggcgctc 
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ggucugguaa 
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ugcuuuguau 


tacaccttac 


accggtgcat 


tggggcttct 


tttgacaccc 


gggaacacta 


acacccgtgt 


gctgcgccgc 


ggtgctgatg 


ggaccgctga 


gaaggatttg 


cattttactg 


gtacgaacgg 


gactctgttt 


aatcttgctg 


atacgcttct 



gctgccttta agggtgatga ttcggtggtc 4680 
gcagctgcct taattgctgg ctgtgggctc 4740 
ctgtatgctg gggtggtggt ggcccccggt 4800 
gctggtcggt tgtctgaaaa gaattggggc 4860 
cttgctgtct gcgacttcct tcgagggttg 4920 
gtgtcccgtg tctatggagt cagccccggg 4980 
accatcgccg atggcaaggc ccactttaca 5040 
aattccatca tacagcgggt ggaatgaata 5100 
catgcgccct agggctgttc tgttgttgtt 5160 
accggccggt cagccgtctg gccgtcgccg 5220 
tttctggagt gacagggttg attctcagcc 5280 
ccccttcgcc gccgatgtcg tttcacaacc 5340 
ccgccccctc ggttccgctt ggcgtgacca 5400 
tcgatctacc ccagctgggg ctgcgccgct 5460 
tcctgtacct gatgttgact cacgtggtgc 5520 
gtccccgctt acatcatctg ttgcttctgg 5580 
gaaccctctc ttgcctcttc aggatggcac 5640 
taattacgcc cagtatcggg ttgttcgggc 5700 
tgctgttggt ggttatgcta tctctatttc 5760 
ttctgttgac atgaattcta tcacttctac 5820 
agcctccgag ttagtcatcc ctagtgaacg 5880 
tgttgagacc acgggtgtgg ccgaagagga 5940 
tcatggctcc cctgttaact cctacactaa 6000 
tgattttgca ttagaacttg aatttagaaa 6060 
ttcccggtat actagcacag cccgccaccg 6120 
gctcaccacc acagcagcca cacgcttcat 6180 
cgttggtgag gtgggtcgtg gtattgccct 6240 
tggtggttta ccgacagaat tgatttcgtc 6300 
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ggctgggggt 


caactgtttt 


actcccgccc 




aaagttatac 


acatctgttg 


agaatgcgca 




catagattta 


ggtgactccc 


gtgtggttat 




tcgacctacc 


ccgtcacctg 


ccccctcccg 




tttgtggctc 


tctctcactg 


ccgctgagta 




ccctatgtat 


gtctctgata 


cagtcacgct 




tgcccgctct 


cttgactggt 


ctaaagttac 




gcagtattct 


aagaaatttt 


atgttctccc 




tggtacgacc 


aaggccggct 


acccgtataa 




gattgagaac 


gcggccggtc 


accgtgtcgc 




cggccctacc 


tcgatytctg 


cggtcggtgt 




cgaggatact 


gttgattatc 


ctgctcgtgc 




tcgcaccctt 


ggtctgcagg 


gttgtgcatt 




taaaatgaag 


gtaggtaaaa 


cccgggagtc 




gttctctttg 


gctttatttc 


tcatttctgc 




aa 

<210> 90 
<211> 7202 
<212> DNA 







<213> Hepatitis E virus 

<220> 

<221> CDS 

<222> (1) . . (5097) 

<223> orfl 

<220> 
<221> CDS 

<222> (5132) . . (7114) 
<223> orf2 

<220> 

<221> misc_feature 
<222> () . . ) 
<223> CDS- orf3 

<220> 

<223> uslfull 



tgttgtctcg gccaatggcg agccaacagt 6360 
gcaagacaag ggcatcacca ttccacacga 6420 
ccaggattat gataaccagc acgaacaaga 6480 
ccctttctca gttcttcgtg ccaatgatgt 6540 
cgrccagacc acgtatgggt cgtccaccaa 6600 
tgttaatgta gccactggtg ctcaggctgt 6660 
tctggatggt cgccctctta ctaccattca 6720 
gcttcgsggg aagctgtcct tttgggaggc 6780 
ttataatacc actgctagtg accaaatttt 6840 
catttctact tataccacta gtttgggtgc 6900 
actagctcca cattcggccc ttgctgttct 6960 
ccatactttt gatgatttct gcccggagtg 7020 
ccaatctact attgctgaac ttcagcgtct 7080 
ttaattaatt ccttttgtgc ccccttcgca 7140 
tttccgcgct ccctggaaaa aaaaaaaaaa 7200 

7202 



<400> 90 
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cct ggc att act act gcc att gag cag get get ctg get gcg gec aat 48 
Pro Gly Ile Thr Thr Ala Ile Glu Gln Ala Ala Leu Ala Ala Ala Asn 
1 5 10 15 

tct gcc ttg gcg aat gct gtg gtg gtt cgg ccg ttt tta tct cgc gtg 96 
Ser Ala Leu Ala Asn Ala Val Val Val Arg Pro Phe Leu Ser Arg Val 
20 25 30 

caa acc gag att ctt att aat ttg atg caa ccc cgg cag ttg gtt ttc 144 
Gln Thr Glu Ile Leu Ile Asn Leu Met Gln Pro Arg Gln Leu Val Phe 
35 40 45 

cgc cct gag gta ctt tgg aat cac cct atc cag cgg gtt ata cat aat 192 
Arg Pro Glu Val Leu Trp Asn His Pro Ile Gln Arg Val Ile His Asn 
50 55 60 

gaa tta gaa cag tac tgc cgg gct cgg gct ggt cgt tgc ttg gag gtt 240 
Glu Leu Glu Gln Tyr Cys Arg Ala Arg Ala Gly Arg Cys Leu Glu Val 
65 70 75 80 

El 9"9 a 9" ct cac cca aga tec att aat gac aac ccc aac gtt ctg cat egg 288 

Gly Ala His Pro Arg Ser Ile Asn Asp Asn Pro Asn Val Leu His Arg 
85 90 95 

tgt ttc ctt aga ccg gtt ggc cga gat gtt cag cgc tgg tac tct gcc 336 
Cys Phe Leu Arg Pro Val Gly Arg Asp Val Gln Arg Trp Tyr Ser Ala 
100 105 110 

ccc acc cgc ggc cct gcg gct aat tgc cgc cgc tcc gcg ttg cgt ggt 384 
Pro Thr Arg Gly Pro Ala Ala Asn Cys Arg Arg Ser Ala Leu Arg Gly 
115 120 125 

ctc ccc ccc gct gac cgc act tac tgc ttt gat gga ttc tcc cgt tgt 432 
Leu Pro Pro Ala Asp Arg Thr Tyr Cys Phe Asp Gly Phe Ser Arg Cys 
130 135 140 

gct ttt gct gca gag acc ggt gtg gct ctt tac tct ctg cat gac ctt 480 
Ala Phe Ala Ala Glu Thr Gly Val Ala Leu Tyr Ser Leu His Asp Leu 
145 150 155 160 

tgg cca gct gat gtt gca gag gct atg gcc cgc cac ggg atr aca cgc 528 
Trp Pro Ala Asp Val Ala Glu Ala Met Ala Arg His Gly Xaa Thr Arg 
165 170 175 

ttg tat gcc gca ctg cac ctt ccc cct gag gtg ctg cta cca ccc ggc 576 
Leu Tyr Ala Ala Leu His Leu Pro Pro Glu Val Leu Leu Pro Pro Gly 
180 185 190 

acc tac cac aca acc tcg tat ctc ctg att cac gac ggc gac cgc gct 624 
Thr Tyr His Thr Thr Ser Tyr Leu Leu Ile His Asp Gly Asp Arg Ala 
195 200 205 

gtt gta act tac gag ggc gat act agt gcg ggc tat aat cat gat gtc 672 
Val Val Thr Tyr Glu Gly Asp Thr Ser Ala Gly Tyr Asn His Asp Val 
210 215 220 

tcc ata ctt cgt gcg tgg atc cgt act aca aaa ata gtt ggt gat cat 720 
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Ser lie Leu Arg Ala Trp lie Arg Thr Thr Lys lie Val Gly Asp His 
225 230 235 240 

ccg ttg gtc ata gag cgt gtg egg gec att gga tgt cat ttt gtg ttg 768 
Pro Leu Val lie Glu Arg Val Arg Ala lie Gly Cys His Phe Val Leu 
245 250 255 

ctg etc acc gca gec cct gag ccg tea ccc atg cct tat gtt cct tac 816 
Leu Leu Thr Ala Ala Pro Glu Pro Ser Pro Met Pro Tyr Val Pro Tyr 
260 265 270 

cct cgt tea acg gag gtg tat gtc egg tec ata ttt ggc cct ggc ggc 864 
Pro Arg Ser Thr Glu Val Tyr Val Arg Ser lie Phe Gly Pro Gly Gly 
275 280 285 

tec cca tec ttg ttt ccg tea gec tgc tct act aaa tct act ttc cat 912 
Ser Pro Ser Leu Phe Pro Ser Ala Cys Ser Thr Lys Ser Thr Phe His 
290 295 300 

get gtc ccg gtg cat ate tgg gat egg etc atg etc ttt ggt gee acc 960 
Ala Val Pro Val His lie Trp Asp Arg Leu Met Leu Phe Gly Ala Thr 
305 310 315 320 

ctg gac gat cag gcg ttt tgc tgt tea egg etc atg act tac etc cgt 1008 
Leu Asp Asp Gin Ala Phe Cys Cys Ser Arg Leu Met Thr Tyr Leu Arg 
325 330 335 

ggt att agt tac aag gtc act gtc ggc gcg ctt gtc get aat gag ggg 1056 
Gly lie Ser Tyr Lys Val Thr Val Gly Ala Leu Val Ala Asn Glu Gly 
340 345 350 

tgg aac gee tct gaa gac get ctt act gca rtg ate act gca get tat 1104 
Trp Asn Ala Ser Glu Asp Ala Leu Thr Ala Xaa lie Thr Ala Ala Tyr 
355 360 365 

ttg act att tgc cat cag cgt tat etc cgc acc cag gcg ata tec aag 1152 
Leu Thr lie Cys His Gin Arg Tyr Leu Arg Thr Gin Ala lie Ser Lys 
370 375 380 

ggc atg cgc egg ttg ggg gtt gag cac gec cag aaa ttt ate aca aga 1200 
Gly Met Arg Arg Leu Gly Val Glu His Ala Gin Lys Phe lie Thr Arg 
385 390 395 400 

etc tac agt tgg eta ttt gag aag tct ggc cgt gat tat ate ccc ggc 1248 
Leu Tyr Ser Trp Leu Phe Glu Lys Ser Gly Arg Asp Tyr lie Pro Gly 
405 410 415 

cgc cag ctt cag ttc tat gca cag tgc cga egg tgg eta tct gca ggc 1296 
Arg Gin Leu Gin Phe Tyr Ala Gin Cys Arg Arg Trp Leu Ser Ala Gly 
420 425 430 

ttc cac eta gac ccc agg gta ctt gtt ttt gat gag tea gta cca tgc 1344 
Phe His Leu Asp Pro Arg Val Leu Val Phe Asp Glu Ser Val Pro Cys 
435 440 445 

cgc tgt agg acg ttt ttg aag aaa gtt gcg ggt aaa ttc tgc tgt ttt 13 92 
Arg Cys Arg Thr Phe Leu Lys Lys Val Ala Gly Lys Phe Cys Cys Phe 
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450 455 460 

atg egg tgg etc ggg cag gag tgt acc tgc ttc ttg gag ccg gec gag 1440 
Met Arg Trp Leu Gly Gin Glu Cys Thr Cys Phe Leu Glu Pro Ala Glu 
465 470 475 480 

ggt tta gtc ggc gat cat ggc cat gac aac gag gec tat gag ggt tct 1488 
Gly Leu Val Gly Asp His Gly His Asp Asn Glu Ala Tyr Glu Gly Ser 
485 490 495 

gag gtc gac ccg get gaa cct gca cat ctt gat gtt tct ggg act tac 1536 
Glu Val Asp Pro Ala Glu Pro Ala His Leu Asp Val Ser Gly Thr Tyr 
500 505 510 

gcc gtc cac ggg cac cag ctt gag gcc ctc tat agg gca ctt aat gtc 1584 
Ala Val His Gly His Gln Leu Glu Ala Leu Tyr Arg Ala Leu Asn Val 
515 520 525 

cca caa gat att gcc gct cga gct tcc cga cta acg gca act gtt gag 1632 
Pro Gln Asp Ile Ala Ala Arg Ala Ser Arg Leu Thr Ala Thr Val Glu 
530 535 540 

ctc gtt gca agt cca gac cgc tta gag tgc cgc acc gtg ctc ggt aat 1680 
Leu Val Ala Ser Pro Asp Arg Leu Glu Cys Arg Thr Val Leu Gly Asn 
545 550 555 560 

aag acc ttc cgg acg acg gtg gtc gac ggc gcc cat cta gag gcg aat 1728 
Lys Thr Phe Arg Thr Thr Val Val Asp Gly Ala His Leu Glu Ala Asn 
565 570 575 

ggc cct gag cag tat gtc tta tea ttt gac gee tec cgt cag tct atg 1776 
Gly Pro Glu Gin Tyr Val Leu Ser Phe Asp Ala Ser Arg Gin Ser Met 
580 585 590 

ggg gee ggg teg cat age etc act tat gag etc acc cct get ggt ttg 1824 
Gly Ala Gly Ser His Ser Leu Thr Tyr Glu Leu Thr Pro Ala Gly Leu 
595 600 605 

cag gtt agg att tea tct aat ggt ctg gat tgc act get aca ttc ccc 1872 
Gin Val Arg He Ser Ser Asn Gly Leu Asp Cys Thr Ala Thr Phe Pro 
610 615 620 

ccc ggt gga gee cct age get gcg ccc ggg gag gtg gca gee ttt tgc 192 0 
Pro Gly Gly Ala Pro Ser Ala Ala Pro Gly Glu Val Ala Ala Phe Cys 
625 630 635 640 

agt gcc ctt tat aga tat aac agg ttc acc cag cgg cac tcg ctg act 1968 
Ser Ala Leu Tyr Arg Tyr Asn Arg Phe Thr Gln Arg His Ser Leu Thr 
645 650 655 

99° 99& tta t99 tta cac cct gag ggg ttg ctg ggt att ttc ccc cct 2 016 
Gly Gly Leu Trp Leu His Pro Glu Gly Leu Leu Gly Ile Phe Pro Pro 
660 665 670 

ttc tcc cct ggg cat atc tgg gag tct gcg aac ccc ttt tgc ggg gag 2064 
Phe Ser Pro Gly His Ile Trp Glu Ser Ala Asn Pro Phe Cys Gly Glu 
675 680 685 
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ggg act ttg tat acc cga act tgg tea aca tct ggc ttt tct agt gat 2112 
Gly Thr Leu Tyr Thr Arg Thr Trp Ser Thr Ser Gly Phe Ser Ser Asp 
690 695 700 

ttc tec ccc cct gaa gcg gec get cct get atg get get acc ccg ggg 2160 
Phe Ser Pro Pro Glu Ala Ala Ala Pro Ala Met Ala Ala Thr Pro Gly 
705 710 715 720 

ctg ccc cat tct acc cca cct gtt age gat att tgg gtg eta cca ccg 22 08 
Leu Pro His Ser Thr Pro Pro Val Ser Asp lie Trp Val Leu Pro Pro 
725 730 735 

ccc tea gag gag ttt cag gtt gat gca gca cct gtg ccc cct gec cct 2256 
Pro Ser Glu Glu Phe Gin Val Asp Ala Ala Pro Val Pro Pro Ala Pro 
740 745 750 

gac cct get gga ttg ccc ggt ccc gtt gtg ctt acc ccc ccc ccc cct 23 04 
Asp Pro Ala Gly Leu Pro Gly Pro Val Val Leu Thr Pro Pro Pro Pro 
755 760 765 

ccc cct gtg cat aag cca tea ata ccc ccg cct tec cgt aac cgt cgt 2352 
Pro Pro Val His Lys Pro Ser lie Pro Pro Pro Ser Arg Asn Arg Arg 
770 775 780 

etc etc tat acc tat cct gac ggc get aag gtg tat gca ggg cca ctg 2400 
Leu Leu Tyr Thr Tyr Pro Asp Gly Ala Lys Val Tyr Ala Gly Ser Leu 
785 790 795 800 

ttt gaa tea gac tgt gac tgg ctg gtt aat gec tea aac ccg ggc cat 2448 
Phe Glu Ser Asp Cys Asp Trp Leu Val Asn Ala Ser Asn Pro Gly His 
805 810 815 

cgt ccc gga ggt ggc etc tgc cat gee ttt tac caa cgt ttt cca gaa 24 96 
Arg Pro Gly Gly Gly Leu Cys His Ala Phe Tyr Gin Arg Phe Pro Glu 
820 825 830 

gcg ttt tac cca act gaa ttc ate atg cgt gag ggt ctt gca gca tac 2544 
Ala Phe Tyr Pro Thr Glu Phe lie Met Arg Glu Gly Leu Ala Ala Tyr 
835 840 845 

acc ttg acc ccg cgc cct ate att cat gca gtc get ccc gat tat agg 2592 
Thr Leu Thr Pro Arg Pro lie lie His Ala Val Ala Pro Asp Tyr Arg 
850 855 860 

gtt gag cag aac ccg aag agg ctt gag gca gcg tac cgt gaa act tgt 2 640 
Val Glu Gin Asn Pro Lys Arg Leu Glu Ala Ala Tyr Arg Glu Thr Cys 
865 870 875 880 

tec cgt cgt ggc acc get gee tac ccg ctt ttg ggt teg ggt ata tac 2688 
Ser Arg Arg Gly Thr Ala Ala Tyr Pro Leu Leu Gly Ser Gly lie Tyr 
885 890 895 

cag gtc cct gtt age etc agt ttt gat gec tgg gaa cgt aat cac cgc 2736 
Gin Val Pro Val Ser Leu Ser Phe Asp Ala Trp Glu Arg Asn His Arg 
900 905 910 
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ccc ggc gat gag ctt tac ttg acc gag ccc get gca aat tgg ttt gag 2 784 
Pro Gly Asp Glu Leu Tyr Leu Thr Glu Pro Ala Ala Asn Trp Phe Glu 
915 920 925 

get aat aag ccg gcg cag ccg gtg etc acc ata act gag gac acg gec 2 832 
Ala Asn Lys Pro Ala Gin Pro Val Leu Thr He Thr Glu Asp Thr Ala 
930 935 940 

cgt acg gec aac ctg gca ttg gag att gat gec get aea gag gtc ggc 2880 
Arg Thr Ala Asn Leu Ala Leu Glu He Asp Ala Ala Thr Glu Val Gly 
945 950 955 960 

cgt get tgt gee ggt tgc acc ate age cct ggc att gtg cac tat cag 2 92 8 
Arg Ala Cys Ala Gly Cys Thr He Ser Pro Gly He Val His Tyr Gin 
965 970 975 

ttt acc gec ggg gtc ccg ggc teg ggc aag tea agg tec ata caa cag 2 976 
Phe Thr Ala Gly Val Pro Gly Ser Gly Lys Ser Arg Ser He Gin Gin 
980 985 990 

gga gat gtc gat gtg gtg gtt gtg ccc acc cgg gag ctt cgt aat agt 3024 
Gly Asp Val Asp Val Val Val Val Pro Thr Arg Glu Leu Arg Asn Ser 
995 1000 1005 

tgg cgc cgc cgg ggt ttt gcg gcc ttc aca ccc cac aca gcg gcc cgt 3072 
Trp Arg Arg Arg Gly Phe Ala Ala Phe Thr Pro His Thr Ala Ala Arg 
1010 1015 1020 

gtt act atc ggc cgc cgc gtt gtg att gat gag gct cca tct ctc ccg 3120 
Val Thr Ile Gly Arg Arg Val Val Ile Asp Glu Ala Pro Ser Leu Pro 
1025 1030 1035 1040 

cca cac ctg ttg ctg tta cat atg cag egg gec tec teg gtc cat etc 3168 
Pro His Leu Leu Leu Leu His Met Gin Arg Ala Ser Ser Val His Leu 
1045 1050 1055 

etc ggt gac cca aat cag ate cct get att gat ttt gag cac gee ggc 3216 
Leu Gly Asp Pro Asn Gin He Pro Ala He Asp Phe Glu His Ala Gly 
1060 1065 1070 

ctg gtc cct gcg ate cgt ccc gag ctt gcg cca acg age tgg tgg ere 3264 
Leu Val Pro Ala He Arg Pro Glu Leu Ala Pro Thr Ser Trp Trp Xaa 
1075 1080 1085 

gtt aca cac cgt tgc ccg gee gat gtg tgc gag etc ata cgc gga gee 3 312 
Val Thr His Arg Cys Pro Ala Asp Val Cys Glu Leu He Arg Gly Ala 
1090 1095 1100 

tac cct aaa ate cag acc acg age cgt gtg eta egg tec ctg ttt tgg 3360 
Tyr Pro Lys He Gin Thr Thr Ser Arg Val Leu Arg Ser Leu Phe Trp 
1105 1110 1115 1120 

aat gaa ccg gec att ggc cag aag ttg gtt ytc acg cag gcg gca aag 3408 
Asn Glu Pro Ala He Gly Gin Lys Leu Val Xaa Thr Gin Ala Ala Lys 
1125 1130 1135 



get get aac cct ggt gcg att acg gtc cac gaa get cag ggt gee acc 
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Ala Ala Asn Pro Gly Ala lie Thr Val His Glu Ala Gin Gly Ala Thr 
1140 1145 1150 

ttc aca gag acc aca ate ata gec acg gec gac gec agg- ggc ctt ate 3 504 
Phe Thr Glu Thr Thr lie lie Ala Thr Ala Asp Ala Arg Gly Leu lie 
1155 1160 1165 

cag tea tec egg get cat get ata gtt gca ctt act cgc cac act gag 3 552 
Gin Ser Ser Arg Ala His Ala lie Val Ala Leu Thr Arg His Thr Glu 
1170 1175 1180 

aag tgt gtt ate ctg gat gee ccc ggc ctg ctt cgt gag gtc ggc att 3 6 00 
Lys Cys Val lie Leu Asp Ala Pro Gly Leu Leu Arg Glu Val Gly lie 
1185 1190 1195 1200 

teg gat gtg att gtc aac aac ttt ttc ctt get ggt ggc gag gtc ggc 3648 
Ser Asp Val lie Val Asn Asn Phe Phe Leu Ala Gly Gly Glu Val Gly 
1205 1210 1215 

crc cac cgc cct tct gtg ata cct cgc ggt aac cct gat caa aac etc 3696 
Xaa His Arg Pro Ser Val lie Pro Arg Gly Asn Pro Asp Gin Asn Leu 
1220 1225 1230 

ggg act tta cag gec ttc ccg ccg tec tgt caa att agt get tac cat 3 744 
Gly Thr Leu Gin Ala Phe Pro Pro Ser Cys Gin lie Ser Ala Tyr His 
1235 1240 1245 

cag ttg get gag gaa ctg ggc cat cgc ccg gee cct gtc gec gec gtc 3 792 
Gin Leu Ala Glu Glu Leu Gly His Arg Pro Ala Pro Val Ala Ala Val 
1250 1255 1260 

ttg ccc cct tgc cct gag ctt gag cag ggc ctg etc tac atg cca cag 3 840 
Leu Pro Pro Cys Pro Glu Leu Glu Gin Gly Leu Leu Tyr Met Pro Gin 
1265 1270 1275 1280 

gag etc act gtg tec gat agt gtg ttg gtt ttt gag ctt acg gat ata 3888 
Glu Leu Thr Val Ser Asp Ser Val Leu Val Phe Glu Leu Thr Asp lie 
1285 1290 1295 

gtt cat tgc cgc atg gec get cca age cag cga aag get gtt etc tea 3 936 
Val His Cys Arg Met Ala Ala Pro Ser Gin Arg Lys Ala Val Leu Ser 
1300 1305 1310 

aca ctt gtg ggg agg tat ggc cgt agg acg aaa eta tat gag gcg gcg 3 984 
Thr Leu Val Gly Arg Tyr Gly Arg Arg Thr Lys Leu Tyr Glu Ala Ala 
1315 1320 1325 

cat tea gat gtt cgt gag tec eta get agg ttc ate cct act ate ggg 4032 
His Ser Asp Val Arg Glu Ser Leu Ala Arg Phe lie Pro Thr lie Gly 
1330 1335 1340 

cct gtt cag get acc aca tgt gag ttg tat gag ttg gtt gag get atg 4 080 
Pro Val Gin Ala Thr Thr Cys Glu Leu Tyr Glu Leu Val Glu Ala Met 
1345 1350 1355 1360 

gtg gag aaa ggt cag gac ggc tct gca gtc tta gag ctt gat ctt tgt 4128 
Val Glu Lys Gly Gin Asp Gly Ser Ala Val Leu Glu Leu Asp Leu Cys 
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1365 1370 1375 

aat cgt gat gtc teg cgc ate aca ttt ttc caa aaa gwe tgc aac aag 4176 
Asn Arg Asp Val Ser Arg lie Thr Phe Phe Gin Lys Xaa Cys Asn Lys 
1380 1385 1390 

ttt aca act ggt gag acc ate gec cac ggc aag gtt ggc cag ggt ata 4224 
Phe Thr Thr Gly Glu Thr lie Ala His Gly Lys Val Gly Gin Gly lie 
1395 1400 1405 

teg gec tgg agt aag acc ttc tgc get ctg ttc ggc ccg tgg ttc cgc 42 72 
Ser Ala Trp Ser Lys Thr Phe Cys Ala Leu Phe Gly Pro Trp Phe Arg 
1410 1415 1420 

gec att gaa aaa gaa ata ttg gee ctg etc ccg cct aat ate ttt tat 4320 
Ala lie Glu Lys Glu lie Leu Ala Leu Leu Pro Pro Asn lie Phe Tyr 
1425 1430 1435 1440 

ggc gac get tat gag gag tea gtt ttt gee gee get gtg tec ggg gcg 436 8 
Gly Asp Ala Tyr Glu Glu Ser Val Phe Ala Ala Ala Val Ser Gly Ala 
1445 1450 1455 

ggg tea tgt atg gta ttt gaa aat gac ttt tea gag ttt gac agt acc 4416 
Gly Ser Cys Met Val Phe Glu Asn Asp Phe Ser Glu Phe Asp Ser Thr 
1460 1465 1470 

cag aat aat ttc tct ctt ggc ctt gag tgt gtg gtt atg gag gag tgc 4464 
Gin Asn Asn Phe Ser Leu Gly Leu Glu Cys Val Val Met Glu Glu Cys 
1475 1480 1485 

ggc atg cct caa tgg eta att agg ttg tac cat ctg gtt egg tct gee 4512 
Gly Met Pro Gin Trp Leu lie Arg Leu Tyr His Leu Val Arg Ser Ala 
1490 1495 1500 

tgg att ctg cag gcg ecg aag gag tct ctt aag ggt ttc tgg aag aag 456 0 
Trp lie Leu Gin Ala Pro Lys Glu Ser Leu Lys Gly Phe Trp Lys Lys 
1505 1510 1515 1520 

cat tct ggt gag cct ggt acc ctt ctt tgg aat acc gtc tgg aat atg 460 8 
His Ser Gly Glu Pro Gly Thr Leu Leu Trp Asn Thr Val Trp Asn Met 
1525 1530 1535 

gcg att ata gca cat tgc tat gag ttc cgt gac ttt cgt gtt get gec 4656 
Ala lie lie Ala His Cys Tyr Glu Phe Arg Asp Phe Arg Val Ala Ala 
1540 1545 1550 

ttt aag ggt gat gat teg gtg gtc etc tgt agt gac tac cga cag age 4 704 
Phe Lys Gly Asp Asp Ser Val Val Leu Cys Ser Asp Tyr Arg Gin Ser 
1555 1560 1565 

cgc aat gca get gec tta att get ggc tgt ggg etc aaa ttg aag gtt 4 752 
Arg Asn Ala Ala Ala Leu lie Ala Gly Cys Gly Leu Lys Leu Lys Val 
1570 1575 1580 

gat tac cgc cct ate ggg ctg tat get ggg gtg gtg gtg gee ccc ggt 4800 
Asp Tyr Arg Pro lie Gly Leu Tyr Ala Gly Val Val Val Ala Pro Gly 
1585 1590 1595 1600 
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ttg ggg aca ctg ccc gat gtg gtg cgt ttt get ggt egg ttg tct gaa 4848 
Leu Gly Thr Leu Pro Asp Val Val Arg Phe Ala Gly Arg Leu Ser Glu 
1605 1610 1615 

aag aat tgg ggc ccc ggc ccg gaa cgt get gag cag ctg cgt ctt get 4896 
Lys Asn Trp Gly Pro Gly Pro Glu Arg Ala Glu Gin Leu Arg Leu Ala 
1620 1625 1630 

gtc tgc gac ttc ctt cga ggg ttg acg aat gtt gcg cag gtc tgt gtt 4944 
Val Cys Asp Phe Leu Arg Gly Leu Thr Asn Val Ala Gin Val Cys Val 
1635 1640 1645 

gat gtt gtg tec cgt gtc tat gga gtc age ccc ggg etc gta cat aac 4 992 
Asp Val Val Ser Arg Val Tyr Gly Val Ser Pro Gly Leu Val His Asn 
1650 1655 1660 

ctt att ggc atg ctg cag acc ate gee gat ggc aag gee cac ttt aca 5040 
Leu lie Gly Met Leu Gin Thr lie Ala Asp Gly Lys Ala His Phe Thr 
1665 1670 1675 1680 

gag act att aaa cct gta ctt gat etc aca aat tec ate ata cag egg 5088 
Glu Thr lie Lys Pro Val Leu Asp Leu Thr Asn Ser lie lie Gin Arg 
1685 1690 1695 

gtg gaa tga ataacatgtc ttttgeateg cccatgggat cacc atg cgc cct agg 5143 
Val Glu Met Arg Pro Arg 

1700 

get gtt ctg ttg ttg ttc etc atg ttt ctg cct atg ctg ccc gcg cca 5191 
Ala Val Leu Leu Leu Phe Leu Met Phe Leu Pro Met Leu Pro Ala Pro 
1705 1710 1715 

ccg gec ggt cag ccg tct ggc cgt cgc cgt ggg egg cgc age ggc ggt 5239 
Pro Ala Gly Gin Pro Ser Gly Arg Arg Arg Gly Arg Arg Ser Gly Gly 
1720 1725 1730 1735 

gee ggc ggt ggt ttc tgg agt gac agg gtt gat tct cag ccc ttc gee 5287 
Ala Gly Gly Gly Phe Trp Ser Asp Arg Val Asp Ser Gin Pro Phe Ala 
1740 1745 1750 

etc ccc tat att cat cca acc aac ccc ttc gee gec gat gtc gtt tea 5335 
Leu Pro Tyr lie His Pro Thr Asn Pro Phe Ala Ala Asp Val Val Ser 
1755 1760 1765 

caa ccc ggg get gga act cgc cct cga cag ccg ccc cgc ccc etc ggt 5383 
Gin Pro Gly Ala Gly Thr Arg Pro Arg Gin Pro Pro Arg Pro Leu Gly 
1770 1775 1780 

tec get tgg cgt gac cag tec aag cgc ccc tec gtt gee ccc cgt cgt 5431 
Ser Ala Trp Arg Asp Gin Ser Lys Arg Pro Ser Val Ala Pro Arg Arg 
1785 1790 1795 



cga tct acc cca get ggg get gcg ccg eta act gec ata tea cca gee 
Arg Ser Thr Pro Ala Gly Ala Ala Pro Leu Thr Ala lie Ser Pro Ala 
1800 1805 1810 1815 
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cct gat aca get cct gta cct gat gtt gac tea cgt ggt get att ttg 552 7 
Pro Asp Thr Ala Pro Val Pro Asp Val Asp Ser Arg Gly Ala lie Leu 
1820 1825 1830 

cgc egg cag tac aat ttg tct acg tec ccg ctt aca tea tct gtt get 5575 
Arg Arg Gin Tyr Asn Leu Ser Thr Ser Pro Leu Thr Ser Ser Val Ala 
1835 1840 1845 

tct ggt act aat ctg gtt etc tat get gee ccg ctg aac cct etc ttg 5623 
Ser Gly Thr Asn Leu Val Leu Tyr Ala Ala Pro Leu Asn Pro Leu Leu 
1850 1855 1860 

cct ctt cag gat ggc acc aac act cat att atg gct act gag gca tct 5671 
Pro Leu Gln Asp Gly Thr Asn Thr His Ile Met Ala Thr Glu Ala Ser 
1865 1870 1875 

aat tac gcc cag tat cgg gtt gtt cgg gct acg att cgt tat cgc ccg 5719 
Asn Tyr Ala Gln Tyr Arg Val Val Arg Ala Thr Ile Arg Tyr Arg Pro 
1880 1885 1890 1895 

ttg gtg cca aat gct gtt ggt ggt tat gct atc tct att tct ttc tgg 5767 
Leu Val Pro Asn Ala Val Gly Gly Tyr Ala Ile Ser Ile Ser Phe Trp 
1900 1905 1910 

cct caa act aca act acc cct act tct gtt gac atg aat tct atc act 5815 
Pro Gln Thr Thr Thr Thr Pro Thr Ser Val Asp Met Asn Ser Ile Thr 
1915 1920 1925 

tct act gat gtc agg atc ttg gtc cag ccc ggt ata gcc tcc gag tta 5863 
Ser Thr Asp Val Arg Ile Leu Val Gln Pro Gly Ile Ala Ser Glu Leu 
1930 1935 1940 

gtc ate cct agt gaa cgc ctt cac tac cgc aac caa ggc tgg cgc tct 5911 
Val lie Pro Ser Glu Arg Leu His Tyr Arg Asn Gin Gly Trp Arg Ser 
1945 1950 1955 

gtt gag acc acg ggt gtg gcc gaa gag gag gct acc tcc ggt ctg gta 5959 
Val Glu Thr Thr Gly Val Ala Glu Glu Glu Ala Thr Ser Gly Leu Val 
1960 1965 1970 1975 

atg ctt tgt att cat ggc tec cct gtt aac tec tac act aat aca cct 6007 
Met Leu Cys lie His Gly Ser Pro Val Asn Ser Tyr Thr Asn Thr Pro 
1980 1985 1990 

tac acc ggt gca ttg ggg ctt ctt gat ttt gca tta gaa ctt gaa ttt 6055 
Tyr Thr Gly Ala Leu Gly Leu Leu Asp Phe Ala Leu Glu Leu Glu Phe 
1995 2000 2005 

aga aat ttg aca ccc ggg aac act aac acc cgt gtt tec egg tat act 6103 
Arg Asn Leu Thr Pro Gly Asn Thr Asn Thr Arg Val Ser Arg Tyr Thr 
2010 2015 2020 

age aca gee cgc cac egg ctg cgc cgc ggt get gat ggg acc get gag 6151 
Ser Thr Ala Arg His Arg Leu Arg Arg Gly Ala Asp Gly Thr Ala Glu 
2025 2030 2035 



etc acc acc aca gca gee aca cgc ttc atg aag gat ttg cat ttt act 
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Leu Thr Thr Thr Ala Ala Thr Arg Phe Met Lys Asp Leu His Phe Thr 
2040 2045 2050 2055 

ggt acg aac ggc gtt ggt gag gtg ggt cgt ggt att gcc ctg act ctg 6247 
Gly Thr Asn Gly Val Gly Glu Val Gly Arg Gly lie Ala Leu Thr Leu 
2060 2065 2070 

ttt aat ctt get gat acg ctt ctt ggt ggt tta ccg aca gaa ttg att 6295 
Phe Asn Leu Ala Asp Thr Leu Leu Gly Gly Leu Pro Thr Glu Leu lie 
2075 2080 2085 

teg teg get ggg ggt caa ctg ttt tac tec cgc cct gtt gtc teg gcc 6343 
Ser Ser Ala Gly Gly Gin Leu Phe Tyr Ser Arg Pro Val Val Ser Ala 
2090 2095 2100 

aat ggc gag cca aca gta aag tta tac aca tct gtt gag aat gcg cag 63 91 
Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val Glu Asn Ala Gin 
2105 2110 2115 

caa gac aag ggc ate ace att cca cac gac ata gat tta ggt gac tec 643 9 
Gin Asp Lys Gly lie Thr lie Pro His Asp lie Asp Leu Gly Asp Ser 
2120 2125 2130 2135 

cgt gtg gtt ate cag gat tat gat aac cag cac gaa caa gat cga cct 6487 
Arg Val Val lie Gin Asp Tyr Asp Asn Gin His Glu Gin Asp Arg Pro 
2140 2145 2150 



acc ccg tca cct gcc ccc tcc cgc cct ttc tca gtt ctt cgt gcc aat 6535 
Thr Pro Ser Pro Ala Pro Ser Arg Pro Phe Ser Val Leu Arg Ala Asn 
2155 2160 2165 

gat gtt ttg tgg ctc tct ctc act gcc gct gag tac grc cag acc acg 6583 
Asp Val Leu Trp Leu Ser Leu Thr Ala Ala Glu Tyr Xaa Gin Thr Thr 
2170 2175 2180 

tat ggg teg tec ace aac cct atg tat gtc tct gat aca gtc acg ctt 6631 
Tyr Gly Ser Ser Thr Asn Pro Met Tyr Val Ser Asp Thr Val Thr Leu 
2185 2190 2195 

gtt aat gta gcc act ggt get cag get gtt gcc cgc tct ctt gac tgg 6679 
Val Asn Val Ala Thr Gly Ala Gin Ala Val Ala Arg Ser Leu Asp Trp 
2200 2205 2210 2215 

tct aaa gtt act ctg gat ggt cgc cct ctt act acc att cag cag tat 6727 
Ser Lys Val Thr Leu Asp Gly Arg Pro Leu Thr Thr lie Gin Gin Tyr 
2220 2225 2230 

tct aag aaa ttt tat gtt etc ccg ctt cgs ggg aag ctg tec ttt tgg 6775 
Ser Lys Lys Phe Tyr Val Leu Pro Leu Xaa Gly Lys Leu Ser Phe Trp 
2235 2240 2245 

gag get ggt acg acc aag gcc ggc tac ccg tat aat tat aat acc act 682 3 
Glu Ala Gly Thr Thr Lys Ala Gly Tyr Pro Tyr Asn Tyr Asn Thr Thr 
2250 2255 2260 



gct agt gac caa att ttg att gag aac gcg gcc ggt cac cgt gtc gcc 6871 
Ala Ser Asp Gln Ile Leu Ile Glu Asn Ala Ala Gly His Arg Val Ala 
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2265 2270 2275 

att tct act tat acc act agt ttg ggt gcc ggc cct acc teg aty tct 6919 
lie Ser Thr Tyr Thr Thr Ser Leu Gly Ala Gly Pro Thr Ser Xaa Ser 
2280 2285 2290 2295 

gcg gtc ggt gta eta get cca cat teg gcc ctt get gtt etc gag gat 6967 
Ala Val Gly Val Leu Ala Pro His Ser Ala Leu Ala Val Leu Glu Asp 
2300 2305 2310 

act gtt gat tat cct get cgt gcc cat act ttt gat gat ttc tgc ccg 7015 
Thr Val Asp Tyr Pro Ala Arg Ala His Thr Phe Asp Asp Phe Cys Pro 
2315 2320 2325 

gag tgt cgc acc ctt ggt ctg cag ggt tgt gca ttc caa tct act att 7 063 
Glu Cys Arg Thr Leu Gly Leu Gin Gly Cys Ala Phe Gin Ser Thr lie 
2330 2335 2340 

get gaa ctt cag cgt ctt aaa atg aag gta ggt aaa acc egg gag tct 7111 
Ala Glu Leu Gin Arg Leu Lys Met Lys Val Gly Lys Thr Arg Glu Ser 
2345 2350 2355 

taa ttaattcctt ttgtgccccc ttcgcagttc tctttggctt tatttctcat 7164 

2360 

ttctgctttc cgcgctccct ggaaaaaaaa aaaaaaaa 7202 



<210> 91 
<211> 1698 
<212> PRT 

<213> Hepatitis E virus 
<400> 91 

Pro Gly Ile Thr Thr Ala Ile Glu Gln Ala Ala Leu Ala Ala Ala Asn 
1 5 10 15 

Ser Ala Leu Ala Asn Ala Val Val Val Arg Pro Phe Leu Ser Arg Val 
20 25 30 

Gln Thr Glu Ile Leu Ile Asn Leu Met Gln Pro Arg Gln Leu Val Phe 
35 40 45 

Arg Pro Glu Val Leu Trp Asn His Pro Ile Gln Arg Val Ile His Asn 
50 55 60 

Glu Leu Glu Gln Tyr Cys Arg Ala Arg Ala Gly Arg Cys Leu Glu Val 
65 70 75 80 

Gly Ala His Pro Arg Ser Ile Asn Asp Asn Pro Asn Val Leu His Arg 
85 90 95 

Cys Phe Leu Arg Pro Val Gly Arg Asp Val Gln Arg Trp Tyr Ser Ala 
100 105 110 

Pro Thr Arg Gly Pro Ala Ala Asn Cys Arg Arg Ser Ala Leu Arg Gly 
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115 120 125 

Leu Pro Pro Ala Asp Arg Thr Tyr Cys Phe Asp Gly Phe Ser Arg Cys 
130 135 140 

Ala Phe Ala Ala Glu Thr Gly Val Ala Leu Tyr Ser Leu His Asp Leu 
145 150 155 160 

Trp Pro Ala Asp Val Ala Glu Ala Met Ala Arg His Gly Xaa Thr Arg 
165 170 175 

Leu Tyr Ala Ala Leu His Leu Pro Pro Glu Val Leu Leu Pro Pro Gly 
180 185 190 

Thr Tyr His Thr Thr Ser Tyr Leu Leu Ile His Asp Gly Asp Arg Ala 
195 200 205 

Val Val Thr Tyr Glu Gly Asp Thr Ser Ala Gly Tyr Asn His Asp Val 
210 215 220 

Ser Ile Leu Arg Ala Trp Ile Arg Thr Thr Lys Ile Val Gly Asp His 
225 230 235 240 

Pro Leu Val Ile Glu Arg Val Arg Ala Ile Gly Cys His Phe Val Leu 
245 250 255 

Leu Leu Thr Ala Ala Pro Glu Pro Ser Pro Met Pro Tyr Val Pro Tyr 
260 265 270 

Pro Arg Ser Thr Glu Val Tyr Val Arg Ser Ile Phe Gly Pro Gly Gly 
275 280 285 

Ser Pro Ser Leu Phe Pro Ser Ala Cys Ser Thr Lys Ser Thr Phe His 
290 295 300 

Ala Val Pro Val His Ile Trp Asp Arg Leu Met Leu Phe Gly Ala Thr 
305 310 315 320 

Leu Asp Asp Gln Ala Phe Cys Cys Ser Arg Leu Met Thr Tyr Leu Arg 
325 330 335 

Gly Ile Ser Tyr Lys Val Thr Val Gly Ala Leu Val Ala Asn Glu Gly 
340 345 350 

Trp Asn Ala Ser Glu Asp Ala Leu Thr Ala Xaa Ile Thr Ala Ala Tyr 
355 360 365 

Leu Thr Ile Cys His Gln Arg Tyr Leu Arg Thr Gln Ala Ile Ser Lys 
370 375 380 

Gly Met Arg Arg Leu Gly Val Glu His Ala Gln Lys Phe Ile Thr Arg 
385 390 395 400 

Leu Tyr Ser Trp Leu Phe Glu Lys Ser Gly Arg Asp Tyr Ile Pro Gly 
405 410 415 

Arg Gln Leu Gln Phe Tyr Ala Gln Cys Arg Arg Trp Leu Ser Ala Gly 
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420 425 430 

Phe His Leu Asp Pro Arg Val Leu Val Phe Asp Glu Ser Val Pro Cys 
435 440 445 

Arg Cys Arg Thr Phe Leu Lys Lys Val Ala Gly Lys Phe Cys Cys Phe 
450 455 460 

Met Arg Trp Leu Gly Gln Glu Cys Thr Cys Phe Leu Glu Pro Ala Glu
465 470 475 480

Gly Leu Val Gly Asp His Gly His Asp Asn Glu Ala Tyr Glu Gly Ser
485 490 495

Glu Val Asp Pro Ala Glu Pro Ala His Leu Asp Val Ser Gly Thr Tyr
500 505 510

Ala Val His Gly His Gln Leu Glu Ala Leu Tyr Arg Ala Leu Asn Val
515 520 525

Pro Gln Asp Ile Ala Ala Arg Ala Ser Arg Leu Thr Ala Thr Val Glu
530 535 540

Leu Val Ala Ser Pro Asp Arg Leu Glu Cys Arg Thr Val Leu Gly Asn
545 550 555 560

Lys Thr Phe Arg Thr Thr Val Val Asp Gly Ala His Leu Glu Ala Asn
565 570 575

Gly Pro Glu Gln Tyr Val Leu Ser Phe Asp Ala Ser Arg Gln Ser Met
580 585 590

Gly Ala Gly Ser His Ser Leu Thr Tyr Glu Leu Thr Pro Ala Gly Leu
595 600 605

Gln Val Arg Ile Ser Ser Asn Gly Leu Asp Cys Thr Ala Thr Phe Pro
610 615 620

Pro Gly Gly Ala Pro Ser Ala Ala Pro Gly Glu Val Ala Ala Phe Cys
625 630 635 640

Ser Ala Leu Tyr Arg Tyr Asn Arg Phe Thr Gln Arg His Ser Leu Thr
645 650 655

Gly Gly Leu Trp Leu His Pro Glu Gly Leu Leu Gly Ile Phe Pro Pro
660 665 670

Phe Ser Pro Gly His Ile Trp Glu Ser Ala Asn Pro Phe Cys Gly Glu
675 680 685 

Gly Thr Leu Tyr Thr Arg Thr Trp Ser Thr Ser Gly Phe Ser Ser Asp 
690 695 700 

Phe Ser Pro Pro Glu Ala Ala Ala Pro Ala Met Ala Ala Thr Pro Gly 
705 710 715 720 



Leu Pro His Ser Thr Pro Pro Val Ser Asp Ile Trp Val Leu Pro Pro
725 730 735

Pro Ser Glu Glu Phe Gln Val Asp Ala Ala Pro Val Pro Pro Ala Pro
740 745 750

Asp Pro Ala Gly Leu Pro Gly Pro Val Val Leu Thr Pro Pro Pro Pro
755 760 765

Pro Pro Val His Lys Pro Ser Ile Pro Pro Pro Ser Arg Asn Arg Arg
770 775 780

Leu Leu Tyr Thr Tyr Pro Asp Gly Ala Lys Val Tyr Ala Gly Ser Leu
785 790 795 800

Phe Glu Ser Asp Cys Asp Trp Leu Val Asn Ala Ser Asn Pro Gly His
805 810 815

Arg Pro Gly Gly Gly Leu Cys His Ala Phe Tyr Gln Arg Phe Pro Glu
820 825 830

Ala Phe Tyr Pro Thr Glu Phe Ile Met Arg Glu Gly Leu Ala Ala Tyr
835 840 845

Thr Leu Thr Pro Arg Pro Ile Ile His Ala Val Ala Pro Asp Tyr Arg
850 855 860

Val Glu Gln Asn Pro Lys Arg Leu Glu Ala Ala Tyr Arg Glu Thr Cys
865 870 875 880

Ser Arg Arg Gly Thr Ala Ala Tyr Pro Leu Leu Gly Ser Gly Ile Tyr
885 890 895

Gln Val Pro Val Ser Leu Ser Phe Asp Ala Trp Glu Arg Asn His Arg
900 905 910

Pro Gly Asp Glu Leu Tyr Leu Thr Glu Pro Ala Ala Asn Trp Phe Glu
915 920 925

Ala Asn Lys Pro Ala Gln Pro Val Leu Thr Ile Thr Glu Asp Thr Ala
930 935 940

Arg Thr Ala Asn Leu Ala Leu Glu Ile Asp Ala Ala Thr Glu Val Gly
945 950 955 960

Arg Ala Cys Ala Gly Cys Thr Ile Ser Pro Gly Ile Val His Tyr Gln
965 970 975

Phe Thr Ala Gly Val Pro Gly Ser Gly Lys Ser Arg Ser Ile Gln Gln
980 985 990

Gly Asp Val Asp Val Val Val Val Pro Thr Arg Glu Leu Arg Asn Ser
995 1000 1005

Trp Arg Arg Arg Gly Phe Ala Ala Phe Thr Pro His Thr Ala Ala Arg
1010 1015 1020



Val Thr Ile Gly Arg Arg Val Val Ile Asp Glu Ala Pro Ser Leu Pro
1025 1030 1035 1040

Pro His Leu Leu Leu Leu His Met Gln Arg Ala Ser Ser Val His Leu
1045 1050 1055

Leu Gly Asp Pro Asn Gln Ile Pro Ala Ile Asp Phe Glu His Ala Gly
1060 1065 1070

Leu Val Pro Ala Ile Arg Pro Glu Leu Ala Pro Thr Ser Trp Trp Xaa
1075 1080 1085

Val Thr His Arg Cys Pro Ala Asp Val Cys Glu Leu Ile Arg Gly Ala
1090 1095 1100

Tyr Pro Lys Ile Gln Thr Thr Ser Arg Val Leu Arg Ser Leu Phe Trp
1105 1110 1115 1120

Asn Glu Pro Ala Ile Gly Gln Lys Leu Val Xaa Thr Gln Ala Ala Lys
1125 1130 1135

Ala Ala Asn Pro Gly Ala Ile Thr Val His Glu Ala Gln Gly Ala Thr
1140 1145 1150

Phe Thr Glu Thr Thr Ile Ile Ala Thr Ala Asp Ala Arg Gly Leu Ile
1155 1160 1165

Gln Ser Ser Arg Ala His Ala Ile Val Ala Leu Thr Arg His Thr Glu
1170 1175 1180

Lys Cys Val Ile Leu Asp Ala Pro Gly Leu Leu Arg Glu Val Gly Ile
1185 1190 1195 1200

Ser Asp Val Ile Val Asn Asn Phe Phe Leu Ala Gly Gly Glu Val Gly
1205 1210 1215

Xaa His Arg Pro Ser Val Ile Pro Arg Gly Asn Pro Asp Gln Asn Leu
1220 1225 1230

Gly Thr Leu Gln Ala Phe Pro Pro Ser Cys Gln Ile Ser Ala Tyr His
1235 1240 1245

Gln Leu Ala Glu Glu Leu Gly His Arg Pro Ala Pro Val Ala Ala Val
1250 1255 1260

Leu Pro Pro Cys Pro Glu Leu Glu Gln Gly Leu Leu Tyr Met Pro Gln
1265 1270 1275 1280

Glu Leu Thr Val Ser Asp Ser Val Leu Val Phe Glu Leu Thr Asp Ile
1285 1290 1295

Val His Cys Arg Met Ala Ala Pro Ser Gln Arg Lys Ala Val Leu Ser
1300 1305 1310

Thr Leu Val Gly Arg Tyr Gly Arg Arg Thr Lys Leu Tyr Glu Ala Ala
1315 1320 1325

His Ser Asp Val Arg Glu Ser Leu Ala Arg Phe Ile Pro Thr Ile Gly
1330 1335 1340

Pro Val Gln Ala Thr Thr Cys Glu Leu Tyr Glu Leu Val Glu Ala Met
1345 1350 1355 1360

Val Glu Lys Gly Gln Asp Gly Ser Ala Val Leu Glu Leu Asp Leu Cys
1365 1370 1375

Asn Arg Asp Val Ser Arg Ile Thr Phe Phe Gln Lys Xaa Cys Asn Lys
1380 1385 1390

Phe Thr Thr Gly Glu Thr Ile Ala His Gly Lys Val Gly Gln Gly Ile
1395 1400 1405

Ser Ala Trp Ser Lys Thr Phe Cys Ala Leu Phe Gly Pro Trp Phe Arg
1410 1415 1420

Ala Ile Glu Lys Glu Ile Leu Ala Leu Leu Pro Pro Asn Ile Phe Tyr
1425 1430 1435 1440

Gly Asp Ala Tyr Glu Glu Ser Val Phe Ala Ala Ala Val Ser Gly Ala
1445 1450 1455

Gly Ser Cys Met Val Phe Glu Asn Asp Phe Ser Glu Phe Asp Ser Thr
1460 1465 1470

Gln Asn Asn Phe Ser Leu Gly Leu Glu Cys Val Val Met Glu Glu Cys
1475 1480 1485

Gly Met Pro Gln Trp Leu Ile Arg Leu Tyr His Leu Val Arg Ser Ala
1490 1495 1500

Trp Ile Leu Gln Ala Pro Lys Glu Ser Leu Lys Gly Phe Trp Lys Lys
1505 1510 1515 1520

His Ser Gly Glu Pro Gly Thr Leu Leu Trp Asn Thr Val Trp Asn Met
1525 1530 1535

Ala Ile Ile Ala His Cys Tyr Glu Phe Arg Asp Phe Arg Val Ala Ala
1540 1545 1550

Phe Lys Gly Asp Asp Ser Val Val Leu Cys Ser Asp Tyr Arg Gln Ser
1555 1560 1565

Arg Asn Ala Ala Ala Leu Ile Ala Gly Cys Gly Leu Lys Leu Lys Val
1570 1575 1580

Asp Tyr Arg Pro Ile Gly Leu Tyr Ala Gly Val Val Val Ala Pro Gly
1585 1590 1595 1600

Leu Gly Thr Leu Pro Asp Val Val Arg Phe Ala Gly Arg Leu Ser Glu
1605 1610 1615

Lys Asn Trp Gly Pro Gly Pro Glu Arg Ala Glu Gln Leu Arg Leu Ala
1620 1625 1630

Val Cys Asp Phe Leu Arg Gly Leu Thr Asn Val Ala Gln Val Cys Val
1635 1640 1645

Asp Val Val Ser Arg Val Tyr Gly Val Ser Pro Gly Leu Val His Asn
1650 1655 1660

Leu Ile Gly Met Leu Gln Thr Ile Ala Asp Gly Lys Ala His Phe Thr
1665 1670 1675 1680

Glu Thr Ile Lys Pro Val Leu Asp Leu Thr Asn Ser Ile Ile Gln Arg
1685 1690 1695 

Val Glu 



<210> 92 
<211> 660 
<212> PRT 

<213> Hepatitis E virus 
<400> 92 

Met Arg Pro Arg Ala Val Leu Leu Leu Phe Leu Met Phe 
1 5 10

Leu Pro Met Leu Pro Ala Pro Pro Ala Gly Gln Pro Ser Gly Arg Arg
15 20 25

Arg Gly Arg Arg Ser Gly Gly Ala Gly Gly Gly Phe Trp Ser Asp Arg
30 35 40 45

Val Asp Ser Gln Pro Phe Ala Leu Pro Tyr Ile His Pro Thr Asn Pro
50 55 60

Phe Ala Ala Asp Val Val Ser Gln Pro Gly Ala Gly Thr Arg Pro Arg
65 70 75

Gln Pro Pro Arg Pro Leu Gly Ser Ala Trp Arg Asp Gln Ser Lys Arg
80 85 90

Pro Ser Val Ala Pro Arg Arg Arg Ser Thr Pro Ala Gly Ala Ala Pro
95 100 105

Leu Thr Ala Ile Ser Pro Ala Pro Asp Thr Ala Pro Val Pro Asp Val
110 115 120 125

Asp Ser Arg Gly Ala Ile Leu Arg Arg Gln Tyr Asn Leu Ser Thr Ser
130 135 140

Pro Leu Thr Ser Ser Val Ala Ser Gly Thr Asn Leu Val Leu Tyr Ala
145 150 155

Ala Pro Leu Asn Pro Leu Leu Pro Leu Gln Asp Gly Thr Asn Thr His
160 165 170

Ile Met Ala Thr Glu Ala Ser Asn Tyr Ala Gln Tyr Arg Val Val Arg
175 180 185



Ala Thr Ile Arg Tyr Arg Pro Leu Val Pro Asn Ala Val Gly Gly Tyr
190 195 200 205

Ala Ile Ser Ile Ser Phe Trp Pro Gln Thr Thr Thr Thr Pro Thr Ser
210 215 220

Val Asp Met Asn Ser Ile Thr Ser Thr Asp Val Arg Ile Leu Val Gln
225 230 235

Pro Gly Ile Ala Ser Glu Leu Val Ile Pro Ser Glu Arg Leu His Tyr
240 245 250

Arg Asn Gln Gly Trp Arg Ser Val Glu Thr Thr Gly Val Ala Glu Glu
255 260 265

Glu Ala Thr Ser Gly Leu Val Met Leu Cys Ile His Gly Ser Pro Val
270 275 280 285

Asn Ser Tyr Thr Asn Thr Pro Tyr Thr Gly Ala Leu Gly Leu Leu Asp
290 295 300

Phe Ala Leu Glu Leu Glu Phe Arg Asn Leu Thr Pro Gly Asn Thr Asn
305 310 315

Thr Arg Val Ser Arg Tyr Thr Ser Thr Ala Arg His Arg Leu Arg Arg
320 325 330

Gly Ala Asp Gly Thr Ala Glu Leu Thr Thr Thr Ala Ala Thr Arg Phe
335 340 345

Met Lys Asp Leu His Phe Thr Gly Thr Asn Gly Val Gly Glu Val Gly
350 355 360 365

Arg Gly Ile Ala Leu Thr Leu Phe Asn Leu Ala Asp Thr Leu Leu Gly
370 375 380

Gly Leu Pro Thr Glu Leu Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr
385 390 395

Ser Arg Pro Val Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr
400 405 410

Thr Ser Val Glu Asn Ala Gln Gln Asp Lys Gly Ile Thr Ile Pro His
415 420 425

Asp Ile Asp Leu Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Asp Asn
430 435 440 445

Gln His Glu Gln Asp Arg Pro Thr Pro Ser Pro Ala Pro Ser Arg Pro
450 455 460

Phe Ser Val Leu Arg Ala Asn Asp Val Leu Trp Leu Ser Leu Thr Ala
465 470 475

Ala Glu Tyr Xaa Gln Thr Thr Tyr Gly Ser Ser Thr Asn Pro Met Tyr
480 485 490

Val Ser Asp Thr Val Thr Leu Val Asn Val Ala Thr Gly Ala Gln Ala
495 500 505

Val Ala Arg Ser Leu Asp Trp Ser Lys Val Thr Leu Asp Gly Arg Pro
510 515 520 525

Leu Thr Thr Ile Gln Gln Tyr Ser Lys Lys Phe Tyr Val Leu Pro Leu
530 535 540

Xaa Gly Lys Leu Ser Phe Trp Glu Ala Gly Thr Thr Lys Ala Gly Tyr
545 550 555

Pro Tyr Asn Tyr Asn Thr Thr Ala Ser Asp Gln Ile Leu Ile Glu Asn
560 565 570

Ala Ala Gly His Arg Val Ala Ile Ser Thr Tyr Thr Thr Ser Leu Gly
575 580 585

Ala Gly Pro Thr Ser Xaa Ser Ala Val Gly Val Leu Ala Pro His Ser
590 595 600 605

Ala Leu Ala Val Leu Glu Asp Thr Val Asp Tyr Pro Ala Arg Ala His
610 615 620

Thr Phe Asp Asp Phe Cys Pro Glu Cys Arg Thr Leu Gly Leu Gln Gly
625 630 635

Cys Ala Phe Gln Ser Thr Ile Ala Glu Leu Gln Arg Leu Lys Met Lys
640 645 650 

Val Gly Lys Thr Arg Glu Ser 
655 660 



<210> 93 
<211> 122 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> ORF3 HEV US-1 
<400> 93 

Met Asn Asn Met Ser Phe Ala Ser Pro Met Gly Ser Pro Cys Ala Leu 
1 5 10 15

Gly Leu Phe Cys Cys Cys Ser Ser Cys Phe Cys Leu Cys Cys Pro Arg
20 25 30

His Arg Pro Val Ser Arg Leu Ala Val Ala Val Gly Gly Ala Ala Ala
35 40 45

Val Pro Ala Val Val Ser Gly Val Thr Gly Leu Ile Leu Ser Pro Ser
50 55 60



Pro Ser Pro Ile Phe Ile Gln Pro Thr Pro Ser Pro Pro Met Ser Phe
65 70 75 80

His Asn Pro Gly Leu Glu Leu Ala Leu Asp Ser Arg Pro Ala Pro Ser
85 90 95

Val Pro Leu Gly Val Thr Ser Pro Ser Ala Pro Pro Leu Pro Pro Val
100 105 110

Val Asp Leu Pro Gln Leu Gly Leu Arg Arg
115 120 



<210> 94 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer
US5P3S/20

<400> 94

tggcattact actgccattg 20

<210> 95
<211> 20
<212> DNA

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Primer 
US5P45S/20 

<400> 95 

caattctgcc ttggcgaatg 20



<210> 96 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
US5P296A 

<400> 96 

aggaaacacc gatgcagaac 20



<210> 97 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
US5P243A/20 
<400> 97

tccaacctcc aagcaacgac 20 



<210> 98 
<211> 199 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Clone 199con 
<400> 98 

caattctgcc ttggcgaatg ctgtggtggt tcggccgttt ctttctcgtg tgcaaactga 60
gattcttatt aatttgatgc aaccccggca gttggtcttc cgccctgagg tgctttggaa 120 
tcatcctatc cagcgggtta tacataatga attagagcag tactgccggg cccgggctgg 180 
tcgttgcttg gaggttgga 199 



<210> 99 
<211> 25 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> JE orf1-s
<400> 99 

gttctgcatc ggtgtttcct tagac 25 



<210> 100 
<211> 26 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> JE orf1-a
<400> 100 

gaatcaggag atacgaggtt gtgtgg 26 



<210> 101 
<211> 331 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-320 
<400> 101 

gttctgcatc ggtgtttcct tagaccggtc ggccgagatg ttcagcgctg gtattctgcc 60
cctacccgtg gtcctgcggc caattgccgc cgctccgcgt tgcgtggtct cccccctgtc 120
gaccgcacct attgttttga tggattttcc cgttgtgctt ttgctgcaga gaccggtgtg 180
gccctttact ctttgcatga cctttggcca gctgatgttg cagaggctat ggcccgccat 240
gggatgacac gcttatacgc cgcactgcac cttccccccg aggtgctgct accacccggc 300
acctaccaca caacctcgta tctcctgatt c 331



<210> 102
<211> 1186
<212> DNA

<213> Hepatitis E virus
<220>

<223> US2-1168
<400> 102

ctcactgtgt ccgatagtgt gttggttttt gagcttacgg atatagtcca ctgccgtatg 60
gccgccccaa gccagcgaaa ggctgttctc tcaacgcttg tggggaggta cggccgtagg 120
actaaattat atgaggcggc gcattcagat gtccgtgagt ccctagcgag gtttatcccc 180
accatcgggc ctgttcgggc taccacatgt gagctgtacg agctggttga agccatggta 240
gagaagggtc aggacggatc tgccgtccta gagctcgacc tttgcaatcg tgacgtctcg 300
cgcatcacat ttttccaaaa ggattgcaat aagtttacaa ctggtgagac tatcgcccat 360
ggcaaggttg gccagggcat atcggcctgg agcaagacct tctgtgctct gtttggcccg 420
tggttccgcg ccattgaaaa ggaaatattg gccctactcc cgcctaatat cttttatggc 480
gacgcctatg aggagtcagt gtttgctgcc gctgtgtccg gggcagggtc atgtatggta 540
tttgaaaatg acttctcaga gtttgacagt acccagaata atttctctct cggccttgag 600
tgtgtggtta tggaggagtg cggcatgccc caatggttaa ttaggttgta ccatctggtc 660
cggtcagcct ggattttgca ggcgccgaag gagtctctta aggggttttg gaagaagcac 720
tctggtgagc ctggtaccct tctctggaac actgtctgga acatggcgat tatagcacat 780
tgctaygagt tccgtgactt tcgtgttgcc gccttcaagg gtgatgattc agtggtcctc 840
tgtagtgact accgacagrg ccgtaacgcg gctgccttaa ttgcaggctg tgggctcaaa 900
ttgaaggttg attaccgccc tatcgggcta tatgctggag tggtggtggc ccccggtttg 960
gggacactgc ccgatgtggt gcgttttgcc ggtcggttat ctgagaagaa ttggggccct 1020
ggcccggagc gtgctgagca gctgcgtctt gctgtttgtg atttccttcg agggttgacg 1080
aatgttgcgc aggtctgtgt tgatgttgtg tcccgtgtct atggagttag ccccgggctg 1140
gtacataacc ttattggcat gctgcagacc atcgccgatg gcaagg 1186



<210> 103 
<211> 23 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> JE hevdf2/3 s1
<400> 103

gttccgcttg gcgtgaccag tcc 23



<210> 104 
<211> 23 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> JE hevdf2/3 a1
<400> 104

gagtcaacat caggtacagg agc 23



<210> 105 
<211> 130 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-135 
<400> 105 

gttccgcttg gcgtgaccag tcccagcgcc cctccgctgc cccccgtcgt cgatctgccc 60 
cagctggggc tgcgccgctg actgccgtgt caccggctcc tgacacagct cctgtacctg 120
atgttgactc 130



<210> 106 
<211> 26 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> JE hevdf1-s1
<400> 106

gatgtcattt tgtgttgctg ctcacc 26

<210> 107
<211> 23 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> hev216 a1
<400> 107

cgtcctacag cggcatggta ctg 23



<210> 108 
<211> 564 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-563 



<400> 108

tcacccatgc cttatgttcc ttaccctcgt tcaacggagg tgtatgtccg gtctatattt 60
ggccctggcg gctccccatc cttgtttcca tcagcctgct ctactaaatc tacctttcat 120
gctgtcccgg ttcacatctg ggatcrgctc atgctctttg gtgccaccct gracgatcag 180
gcgttctgct gttcacggct tatgacttac ctccgtggta ttagttataa ggtcactgtc 240
ggtgcgcttg tcgctaatga ggggtggaac gcctctgagg atgctcttac tgcagtgatc 300
actgcggcct atctgaccat ctgccatcag cgttaccttc gcacccaggc gatttccaag 360
ggcatgcgcc ggttggaggt tgagcatgct cagaaattta tcacaagact ctacagctgg 420
ctatttgaga agtctggccg tgactacatc cccggccgcc agcttcaatt ttatgcacaa 480
tgccgacggt ggctttctgc aggcttccac ctaracccca ggrtgcttgt ctttgatgaa 540
tcagtaccat gccgctgtag gacg 564



<210> 109 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> USorf2.1
<400> 109 

gtggagctag tacaccgacc gcag 24 



<210> 110 
<211> 678 
<212> DNA 
<213> Hepatitis E virus
<220> 

<223> US2-667 
<400> 110 

cgcttcttgg tggtttaccg acagaattga tttcgtcggc tgggggccaa ctgttttact 60
cccgcccggt tgtctcagcc aatggcgagc caacagtaaa gttatataca tctgttgaga 120
atgcgcagca agacaagggc atcaccattc cacatgatat agacctgggt gactcccgtg 180
tggttatcca ggattatgat aaccagcayg agcaagaccg acctactccg tcacctgccc 240
cctctcgccc cttctcagtt cttcgtgcca atgatgtttt gtggctttcc ctcactgccg 300
ctgagtatga ccagactacg tatgggtcgt ccaccaaccc tatgtatgtc tctgacacag 360
ttacgcttgt taatgtggct actggtgctc aggctgttgc ccgctccctt gattggtcta 420
aagttactct ggacggccgc ccccttacta ccattcagca gtattctaag acattttatg 480
ttctcccgct ccgcgggaag ctgtcctttt gggaggctgg cacgactaag gccggctacc 540
cttacaatta taatactacc gctagtgacc aaattttgat tgagaatgcg gccggccacc 600 
gtgtcgctat ttccacctat accactagct taggtgccgg tcctacctcg atctctgcgg 660 
tcggtgtact agctccac 678 



<210> 111 
<211> 23 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> hev3301s 
<400> 111 

gtatgcgagc tcatccgtgg tgc 23



<210> 112 
<211> 25 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> JE hev167-a1
<400> 112

ccaacacact atcggacaca gtgag 25



<210> 113 
<211> 580 
<212> DNA

<213> Hepatitis E virus 
<220> 

<223> US2-579 
<400> 113 

gtatgcgagc tcatccgtgg tgcctacccc aaaattcaga ccacgagccg tgtgctacgg 60 
tccctgtttt ggaacgaacc ggccatcggc caaaagttgg tttttacgca ggctgctaag 120 
gctgccaacc ctggtgcgat tacggttcac gaagctcagg gtgctacttt cacggagacc 180 
acaattatag ccacggccga cgctaggggc ctcattcagt catcccgggc ccatgctata 240
gtcgcactca cccgccatac tgagaagtgt gttattttgg atgcccccgg cttgttgcgc 300
gaggtcggca tttcggatgt tattgtcaat aactttttcc ttgccggtgg agaggtcggc 360
catcaccgcc cttctgtgat acctcgcggc aatcctgatc agaacctcgg gactctacag 420
gcctttccgc cgtcatgtca gatcagtgct taccatcagt tggctgagga actaggtcat 480
cgcccggccc ctgtcgccgc cgtcttgccc ccttgccctg agcttgagca gggcctgctc 540
tatatgccac aagaactcac tgtgtccgat agtgtgttgg 580 



<210> 114 
<211> 26 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEV459 s1
<400> 114

cagaaattta tcacaagact ctacag 26



<210> 115 
<211> 26 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEV459 S3 
<400> 115 

ctctacagtt ggctatttga gaagtc 26 



<210> 116 
<211> 25 
<212> DNA 

<213> Hepatitis E virus 
<220>

<223> JE1955a 
<400> 116 

ctataaagag ctgagcagaa ggcgg 25

<210> 117 
<211> 734 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-733 
<400> 117 



ctctacagtt ggctatttga gaagtctggc cgtgactaca tccccggccg ccagcttcaa 60
ttttatgcac aatgccgacg gtggctttct gcaggcttcc acctaraccc caggrtgctt 120
gtctttgatg aatcagtgcc atgccgttgc aggacgtttt tgaagaaggt cgcgggtaaa 180
ttctgctgtt ttatgcggtg gctggggcag gagtgtacct gcttcttgga gccagccgag 240
ggtttagttg gtgatcaagg tcatgacaac gaggcctatg aaggttctga ggtcgaccca 300
gctgagcctg cacatcttga tgtctcgggg acttatgccg tccatgggca ccagcttgag 360
gccctctata gggcacttaa tgtcccacat gatattgccg ctcgagcctc ccgactaacg 420
gctactgttg agctcgttgc tagtccggac cgcttagagt gccgcactgt acttggtaat 480
aagaccttcc ggacgacggt ggttgatggc gcccatcttg aagcgaatgg ccctgaggag 540
tatgttctgt catttgacgc ctctcgccag tctatggggg ccgggtcgca cagcctcact 600
tatgagctca cccctgccgg tctgcaggta aagatttcat ctaatggtct ggattgcact 660
gccacattcc ccccyggtgg cgcccctagc gccgcgccgg gggaggtggc cgccttctgc 720
tcagctcttt atag 734



<210> 118
<211> 22
<212> DNA

<213> Hepatitis E virus
<220>

<223> JE 2950mex s
<400> 118

gtgtccccgg ctctggcaag tc 22



<210> 119 
<211> 22 
<212> DNA

<213> Hepatitis E virus 
<220> 

<223> JE us2-579-a2 
<400> 119 

cagggttggc agccttagca gc 22 



<210> 120 
<211> 483 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-482 
<400> 120 



gtgtccccgg ctctggcaag tcaaggtcca tacaacaggg agatgtcgat gtggtggttg 60
tgcccacccg ggagctccgt aacagctggc gtcgccgggg ttttgcggcc ttcacacctc 120
acacagcggc ccgtgttact atcggccgcc gcgttgtgat tgatgaggct ccatctctcc 180
caccgcacct gctgctgtta cacatgcagc gggcctcctc ggtccatctc cttggtgatc 240
caaaccagat tcctgctatt gattttgagc atgccggcct ggtccccgcg atccgccccg 300
agcttgcgcc aacgagctgg tggcacgtta cacaccgttg cccggccgat gtgtgcgagc 360
tcatacgtgg ggcctacccc aaaattcaga ccacgagccg tgtgctacgg tccctgtttt 420
ggaacgaacc ggccatcggc caaaagttgg tttttacgca ggctgctaag gctgccaacc 480
ctg 483



<210> 121 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> JE 2600s 
<400> 121 

taacccaaag aggcttgagg ctgc 24 



<210> 122 
<211> 22 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-482-a1
<400> 122

ccgctgtgtg aggtgtgaag gc 22 



<210> 123 
<211> 23 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-482-a2 
<400> 123 

gacgccagct gttacggagc tcc 23



<210> 124 
<211> 431 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-430 
<400> 124 

taacccaaag aggcttgagg ctgcgtaccg ggaaacttgc tcccgtcgtg gcaccgctgc 60
ctacccgctt ttgggctcgg gtatatacca ggtccctgtt agcctcagtt ttgatgcctg 120
ggaacgcaat caccgccccg gcgatgagct ttacttgaca gagcccgccg cagcctggtt 180
tgaggctaat aagccggcgc agccggcgct tactataact gaggacacgg cccgtacggc 240
caacctggca ttagagattg atgccgccac agaggttggc cgtgcttgtg ccggctgcac 300
catcagcccc gggattgtgc actatcagtt taccgccggg gtcccgggct caggcaagtc 360
aaggtccata caacagggag atgtcgatgt ggtggttgtg cccacccggg agctccgtaa 420
cagctggcgt c 431 



<210> 125 
<211> 22 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-orf2/3 s1
<400> 125

cgtcgtcgat ctgccccagc tg 22



<210> 126
<211> 25
<212> DNA

<213> Hepatitis E virus 
<220> 

<223> HEVConsORF2-a1
<400> 126 

cttgttcrtg ytggttrtca taatc 25 



<210> 127 
<211> 21 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-orf2/3 s2 
<400> 127 

cgctgactgc cgtgtcaccg g 21 



<210> 128 
<211> 25 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEVConsORF2-a2 
<400> 128 

gttcrtgytg gttrtcataa tcctg 25 



<210> 129 
<211> 1020 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-1019 



<400> 129

cgctgactgc cgtgtcaccg gctcctgaca cagcccctgt acctgatgtt gactcacgtg 60
gtgctattct gcgccggcag tacaatttgt ccacgtcccc gctcacgtca tctgtcgctt 120
cgggtactaa tttggtcctc tatgctgccc cgctgaatcc cctcttgcct ctccaggatg 180
gtaccaacac tcatattatg gctactgagg catccaatta tgcccagtat cgggttgttc 240
gagctacaat ccgttatcgc ccgctggtgc cgaatgccgt tggtggctat gccatttcca 300
tttctttctg gccccaaact acaactaccc ctacttctgt cgatatgaat tctattactt 360
ccacygatgt taggattttg gttcagcccg gtattgcctc cgagctagtc atccccagtg 420
agcgccttca ttaccgtaat caaggctggc gctctgttga gaccacgggt gtggctgagg 480
aggaggctac ttccggtctg gtaatgcttt gcattcatgg ctctcctgtt aattcctaca 540
ctaatacacc ttacactggt gcgctggggc ttcttgattt tgcactagag cttgaattta 600
ggaatttgac acccgggaac accaacaccc gtgtttcccg gtataccagc acagcccgcc 660
accggctgcg ccgtggtgct gatgggactg ctgagcttac taccacagca gccacacgtt 720
tcatgaagga cctgcacttc gctggcacga atggcgttgg tgaggtgggt cgtggtatcg 780
ccctgacact gttcaatctc gctgatacgc ttctcggcgg tttaccgaca gaattgattt 840
cgtcggctgg gggccaactg ttttactccc gcccggttgt ctcagccaat ggcgagccaa 900
cagtaaagtt atatacatct gttgagaatg cgcagcaaga caagggcatc accattccac 960
atgatataga cctgggtgac tcccgtgtgg ttatccagga ttatgataac cagcaygaac 1020



<210> 130 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2 330s1
<400> 130 

cagctgatgt tgcagaggct atgg 24 



<210> 131 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2 563a1
<400> 131 

gcaggctgat ggaaacaagg atgg 24 



<210> 132 
<211> 407 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-406 
<400> 132 

cagctgatgt tgcagaggct atggcccgcc atgggatgac acgcttatac gccgcactgc 60
accttccccc cgaggtgctg ctaccacccg gcacctacca cacaacctcg tacctcttga 120
ttcacgatgg caaccgcgct gttgtaactt acgagggcga tactagtgcg ggctataatc 180
atgatgtctc catacttcgt gcatggatcc gtactactaa aatagttggt gaccatccat 240 
tggtcataga gcgagtgcgg gccattgggt gtcattttgt gctgctgctc accgcagccc 300 
ctgaaccgtc acctatgcct tatgttccct accctcgttc aacggaggtg tatgtccggt 360 
ctatatttgg ccctggcggc tccccatcct tgtttccatc agcctgc 407 



<210> 133 
<211> 22 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-579 s1
<400> 133 

cagaccacga gccgtgtgct ac 22 



<210> 134 
<211> 23 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-1168 a1
<400> 134 

ccacaagcgt tgagagaaca gcc 23 



<210> 135 
<211> 22 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-579 s2 
<400> 135 

gctgctaagg ctgccaaccc tg 22 



<210> 136 
<211> 547 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-579wb 
<400> 136 






gctgctaagg ctgccaaccc tggtgcgatt acggttcacg aagctcaggg tgctactttc 60
acggagacca caattatagc cacggccgac gctaggggcc tcattcagtc atcccgggcc 120
catgctatag tcgcactcac ccgccatact gagaagtgtg ttattttgga tgcccccggc 180
ttgttgcgcg aggtcggcat ttcggatgtt attgtcaata actttttcct tgccggtgga 240
gaggtcggcc atcaccgccc ttctgtgata cctcgcggca atcctgatca gaacctcggg 300
actctacagg cctttccgcc gtcatgtcag atcagtgctt accatcagtt ggctgaggaa 360
ctaggtcatc gcccggcccc tgtcgccgcc gtcttgcccc cttgccctga gcttgagcag 420
ggcctgctct atatgccaca agaacttact gtgtccgata gcgtgctggt ttttgagctt 480
acggatatag tccactgccg tatggccgcc ccaagccagc gaaaggctgt tctctcaacg 540
cttgtgg 547 



<210> 137 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-733S1 
<400> 137 

cacagcctca cttatgagct cacc 24 



<210> 138 
<211> 23 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-430a1
<400> 138

cggtgattgc gttcccaggc atc 23



<210> 139 
<211> 26 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-733s2 
<400> 139 

ctgcaggtaa agatttcatc taatgg 26 
<210> 140
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-430a2 
<400> 140 

ccaggcatca aaactgaggc taac 24

<210> 141 
<211> 903 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-851 
<400> 141 

ctgcaggtaa agatttcatc taatggtctg gattgcactg ccacattccc cccyggtggc 60
gcccctagcg ccgcgccggg ggaggtggcs gccttctgca gtgctcttta tagatacaat 120
aggttcaccc agcggcattc gctgacaggc ggactatggc tacatcctga ggggctgctg 180
ggtatcttcc ccccattctc ccctgggcat atttgggagt ctgctaaccc cttttgcggt 240
gaggggactt tgtatacccg aacctggtca acctctggtt tttctagtga tttctccccc 300
cctgaggcgg ccgctcctgc ttcggctgcc gccccggggt tgccctaccc tactccacct 360
gttagtgata tctgggtgtt accaccgccc tcagaggaat ctcatgttga tgcggcatct 420
gtaccctctg ttcctgagcc tgctggattg accagcccta ttgtgcttac cccccccccc 480
ccccctcctc ccgtgcgtaa gccggcaaca tccccgcctc cccgcactcg ccgtctcctt 540
tacacctacc ccgacggcgc caaggtgtat gcggggtcat tgtktgagtc agactgtgat 600
tggttagtca atgcctcaaa ccctggccat cgccccgggg gtggcctctg ccatgctttt 660
tatcaacgtt tcccagaagc gttctactcg actgaattca tcatgcgcga gggccttgca 720
gcatacactt taaccccgcg ccctattatc catgcagtgg ctcccgacta tagggttgag 780
caaaacccga agaggcttga ggcagcgtac cgggaaactt gctcccgtcg tggcaccgct 840
gcctacccgc ttttgggctc gggtatatac caggtccctg ttagcctcag ttttgatgcc 900
tgg 903



<210> 142
<211> 24
<212> DNA

<213> Hepatitis E virus
<220>

<223> us2-1168s1
<400> 142

gcaggtctgt gttgatgttg tgtc 24



<210> 143 
<211> 21 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-dforf2/3 a2 
<400> 143 

ccggtgacac ggcagtcagc g 21

<210> 144 
<211> 25 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-1168s2 
<400> 144 

gatgttgtgt cccgtgtcta tggag 25



<210> 145 
<211> 22 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2 dforf2/3 a3 
<400> 145 

cagctggggc agatcgacga cg 22



<210> 146 
<211> 503 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> US2-502 
<400> 146 

gatgttgtgt cccgtgtcta tggagttagc cccgggctgg tacataacct tattggcatg 60
ctgcagacca ttgctgatgg caaggcccac tttacagara atattaaacc tgtgcttgac 120
yy y y y f- era ahaarst crfc ctt t fccrca tcgcccatgg 180
uy tut i-ciyy y ctattctatt gttgctcttc gtgcttttgc ctatgctgcc 240
cgcgccaccg gccggccagc cgtctggccg ccgtcgtggg cggcgcagcg gcggtgccgg 300
cggtggtttc tggggtgaca gggttgattc tcagcccttc gccctcccct atattcatcc 360
aaccaacccc ttcgccgccg atgtcgtttc acaacccggg gctggaactc gccctcgaca 420
gccgccccgc ccccttggyt ccgcttggcg tgaccagtcc cagcgcccct ccgctgcccc 480
ccgtcgtcga tctgccccag ctg 503



<210> 147 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEVConsORF1-s1
<400> 147

ctggcatyac tactgcyatt gagc 24



<210> 148 
<211> 23 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEVConsORF1-a1
<400> 148

ccatcrarrc agtaagtgcg gtc 23



<210> 149 
<211> 418 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-orfl 
<400> 149 

ctggcattac tactgctatt gagcaggctg ctctggctgc ggctaattcc gccttggcga 60
atgctgtggt ggttcggccg tttctttctc gtgtgcaaac tgagattctt attaatttga 120
tgcaaccccg gcagttggtc ttccgccctg aggtgctttg gaatcatcct atccagcggg 180
ttatacataa tgaattagag cagtactgcc gggcccgggc tggtcgttgt ttggaggttg 240
gagcccaccc gaggtccatt aatgacaacc ctaatgtctt gcataggtgt tttcttagac 300
cggtcggccg agatgttcag cgctggtatt ctgcccctac ccgtggtcct gcggccaatt 360
gccgccgctc cgcgttgcgt ggtctccccc ctgtcgaccg cacttactgt tttgatgg 418



<210> 150 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEVConsORF2-s1
<400> 150 

gacagaattr atttcgtcgg ctgg 24 



<210> 151 
<211> 197 
<212> DNA 

<213> Hepatitis E virus 



<220> 

<223> us2-orf2 



<400> 151 

gacagaattg atttcgtcgg ctgggggcca actgttttac tcccgcccgg ttgtctcagc 60
caatggcgag ccaacagtaa agttatatac atctgttgag aatgcgcagc aagacaaggg 120
catcaccatt ccacatgata tagacctggg tgactcccgt gtggttatcc aggattatga 180
taaccagcay gagcaag 197



<210> 152 
<211> 22 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEVConsORF2-s2
<400> 152 

gtygtctcrg ccaatggcga gc 22 



<210> 153 
<211> 901 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-3p 
<400> 153

gttgtctcag ccaatggcga gccaacagta aagttatata catctgttga gaatgcgcag 60
caagacaagg gcatcaccat tccacatgat atagacctgg gtgactcccg tgtggttatc 120
caggattatg ataaccagca ygagcaagac cgacctactc cgtcacctgc cccctctcgc 180
cccttctcag ttcttcgtgc caatgatgtt ttgtggcttt ccctcactgc cgctgagtat 240
gaccagacta cgtatgggtc gtccaccaac cctatgtatg tctctgacac agttacgctt 300
gttaatgtgg ctactggtgc tcaggctgtt gcccgctccc ttgattggtc taaagttact 360
ctggacggcc gcccccttac taccattcag cagtattcta agacatttta tgttctcccg 420
ctccgcggga agctgtcctt ttgggaggct ggcacgacta aggccggcta cccttacaat 480
tataatacta ccgctagtga ccaaattttg attgagaatg cggccggcca ccgtgtcgct 540
atttccacct ataccactag cttaggtgcc ggtcctacct cgatctctgc ggtcggcgta 600
ctggctccac actctgccct tgccgttctt gaggatacta ttgattaccc cgcccgtgcc 660
catacttttg atgatttttg cccggagtgc cgtaccctag gtttgcaggg ttgtgcattc 720
cagtctacta ttgctgagct ccagcgttta aaaatgaagg taggtaaaac ccgggagtct 780
taattaattc cttctgtgcc cccttcgtag tttctttcgc ttttatttct tatttctgct 840
ttccgcgctc cctggaaaaa aaaaaaaaaa aaaaaaaaaa agtactagtc gacgcgtggc 900
c 901



<210> 154
<211> 27
<212> DNA

<213> Hepatitis E virus
<220>

<223> us2-gap s1
<400> 154

tatagataac aataggttca cccagcg 27



<210> 155
<211> 25
<212> DNA

<213> Hepatitis E virus
<220>

<223> us2-gap a1
<400> 155

attcagtcga gtagaacgct tctgg 25

<210> 156
<211> 23 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-gap s2 
<400> 156 

cggactatgg ctacatcctg agg 23



<210> 157 
<211> 26 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us2-gap a2 
<400> 157 

ttgactaacc aatcacagtc tgactc 26 



<210> 158 
<211> 462 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 13906-gap 



<400> 158

cggactatgg ctacatcctg aggggctgct gggtatcttc cccccattct cccctgggca 60
tatttgggag tctgctaacc ccttttgcgg tgaggggact ttgtataccc gaacctggtc 120
aacctctggt ttttctagtg atttctcccc ccctgaggcg gccgctcctg cttcggctgc 180
cgccccgggg ttgccctacc ctactccacc tgttagtgat atctgggtgt taccaccgcc 240
ctcagaggaa tctcatgttg atgcggcatc tgtaccctct gttcctgagc ctgctggatt 300
gaccagccct attgtgctta cccccccccc cccccctcct cccgtgcgta agccggcaac 360
atccccgcct ccccgcactc gccgtctcct ttacacctac cccgacggcg ccaaggtgta 420
tgcggggtca ttgtttgagt cagactgtga ttggttagtc aa 462



<210> 159 
<211> 21 
<212> DNA 

<213> Hepatitis E virus 



<220> 



<223> us-575a 



<400> 159 

gccgggtggt agcagcacct c 



<210> 160 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us-426s 
<400> 160 

cgttgtgctt ttgctgcaga gacc 



<210> 161 
<211> 22 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us-84a 
<400> 161 

gaaacggccg aaccaccaca gc 

<210> 162 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us-484s 
<400> 162 

cagctgatgt tgcagaggct atgg 



<210> 163 
<211> 22 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> us-78a 
<400> 163 

gccgaaccac cacagcattc gc 



<210> 164 
<211> 7277 
<212> DNA 
<213> Hepatitis E virus
<220> 

<223> us2full 
<400> 164 

tcgacagggg gcagaccacg tatgtggtcg atgccatgga ggcccatcag ttcattaagg 60
ctcctggcat tactactgct attgagcagg ctgctctggc tgcggctaat tccgccttgg 120
cgaatgctgt ggtggttcgg ccgtttcttt ctcgtgtgca aactgagatt cttattaatt 180
tgatgcaacc ccggcagttg gtcttccgcc ctgaggtgct ttggaatcat cctatccagc 240
gggttataca taatgaatta gagcagtact gccgggcccg ggctggtcgt tgtttggagg 300
ttggagccca cccgaggtcc attaatgaca accctaatgt cttgcatagg tgttttctta 360
gaccggtcgg ccgagatgtt cagcgctggt attctgcccc tacccgtggt cctgcggcca 420
attgccgccg ctccgcgttg cgtggtctcc cccctgtcga ccgcacctat tgttttgatg 480
gattttcccg ttgtgctttt gctgcagaga ccggtgtggc cctttactct ttgcatgacc 540
tttggccagc tgatgttgca gaggctatgg cccgccatgg gatgacacgc ttatacgccg 600
cactgcacct tccccccgag gtgctgctac cacccggcac ctaccacaca acctcgtacc 660
tcttgattca cgatggcaac cgcgctgttg taacttacga gggcgatact agtgcgggct 720
ataatcatga tgtctccata cttcgtgcat ggatccgtac tactaaaata gttggtgacc 780
atccattggt catagagcga gtgcgggcca ttgggtgtca ttttgtgctg ctgctcaccg 840
cagcccctga accgtcacct atgccttatg ttccctaccc tcgttcaacg gaggtgtatg 900
tccggtctat atttggccct ggcggctccc catccttgtt tccatcagcc tgctctacta 960
aatctacctt tcatgctgtc ccggttcaca tctgggatcr gctcatgctc tttggtgcca 1020
ccctgracga tcaggcgttc tgctgttcac ggcttatgac ttacctccgt ggtattagtt 1080
ataaggtcac tgtcggtgcg cttgtcgcta atgaggggtg gaacgcctct gaggatgctc 1140
ttactgcagt gatcactgcg gcctatctga ccatctgcca tcagcgttac cttcgcaccc 1200
aggcgatttc caagggcatg cgccggttgg aggttgagca tgctcagaaa tttatcacaa 1260
gactctacag ctggctattt gagaagtctg gccgtgacta catccccggc cgccagcttc 1320
aattttatgc acaatgccga cggtggcttt ctgcaggctt ccacctarac cccaggrtgc 1380
ttgtctttga tgaatcagtg ccatgccgtt gcaggacgtt tttgaagaag gtcgcgggta 1440
aattctgctg ttttatgcgg tggctggggc aggagtgtac ctgcttcttg gagccagccg 1500
agggtttagt tggtgatcaa ggtcatgaca acgaggccta tgaaggttct gaggtcgacc 1560
ccccctgagg
cccccccctc
gattggttag


ggggtggcct
taactgagga


cacggcccgt 


acggccaacc


ttggccgtgc 


ttgtgccggc 


tgcaccatca 


ccggggtccc 


gggctcaggc 


aagtcaaggt 


ttgtgcccac 


ccgggagctc 


cgtaacagct


ctcacacagc 


ggcccgtgtt 


actatcggcc 


tcccaccgca 


cctgctgctg 


ttacacatgc 



ggacttatgc cgtccatggg caccagcttg 1620
atgatattgc cgctcgagcc tcccgactaa 1680
accgcttaga gtgccgcact gtacttggta 1740
gcgcccatct tgaagcgaat ggccctgagg 1800
agtctatggg ggccgggtcg cacagcctca 1860
taaagatttc atctaatggt ctggattgca 1920
gcgccgcgcc gggggaggtg gcsgccttct 1980
cccagcggca ttcgctgaca ggcggactat 2040
tccccccatt ctcccctggg catatttggg 2100
ctttgtatac ccgaacctgg tcaacctctg 2160
cggccgctcc tgcttcggct gccgccccgg 2220
atatctgggt gttaccaccg ccctcagagg 2280
ctgttcctga gcctgctgga ttgaccagcc 2340
ctcccgtgcg taagccggca acatccccgc 2400
accccgacgg cgccaaggtg tatgcggggt 2460
tcaatgcctc aaaccctggc catcgccccg 2520
gtttcccaga agcgttctac tcgactgaat 2580
ctttaacccc gcgccctatt atccatgcag 2640
cgaagaggct tgaggcagcg taccgggaaa 2700
cgcttttggg ctcgggtata taccaggtcc 2760
gcaatcaccg ccccggcgat gagctttact 2820
ctaataagcc ggcgcagccg gcgcttacta 2880
tggcattaga gattgatgcc gccacagagg 2940
gccccgggat tgtgcactat cagtttaccg 3000
ccatacaaca gggagatgtc gatgtggtgg 3060
ggcgtcgccg gggttttgcg gccttcacac 3120
gccgcgttgt gattgatgag gctccatctc 3180
agcgggcctc ctcggtccat ctccttggtg 3240
atccaaacca gattcctgct attgattttg agcatgccgg cctggtcccc gcgatccgcc 3300
ccgagcttgc gccaacgagc tggtggcacg ttacacaccg ttgcccggcc gatgtgtgcg 3360
agctcatacg tggggcctac cccaaaattc agaccacgag ccgtgtgcta cggtccctgt 3420
tttggaacga accggccatc ggccaaaagt tggtttttac gcaggctgct aaggctgcca 3480
accctggtgc gattacggtt cacgaagctc agggtgctac tttcacggag accacaatta 3540
tagccacggc cgacgctagg ggcctcattc agtcatcccg ggcccatgct atagtcgcac 3600
tcacccgcca tactgagaag tgtgttattt tggatgcccc cggcttgttg cgcgaggtcg 3660
gcatttcgga tgttattgtc aataactttt tccttgccgg tggagaggtc ggccatcacc 3720
gcccttctgt gatacctcgc ggcaatcctg atcagaacct cgggactcta caggcctttc 3780
cgccgtcatg tcagatcagt gcttaccatc agttggctga ggaactaggt catcgcccgg 3840
cccctgtcgc cgccgtcttg cccccttgcc ctgagcttga gcagggcctg ctctatatgc 3900
cacaagaact tactgtgtcc gatagcgtgc tggtttttga gcttacggat atagtccact 3960
gccgtatggc cgccccaagc cagcgaaagg ctgttctctc aacgcttgtg gggaggtacg 4020
gccgtaggac taaattatat gaggcggcgc attcagatgt ccgtgagtcc ctagcgaggt 4080
ttatccccac catcgggcct gttcgggcta ccacatgtga gctgtacgag ctggttgaag 4140
ccatggtaga gaagggtcag gacggatctg ccgtcctaga gctcgacctt tgcaatcgtg 4200
acgtctcgcg catcacattt ttccaaaagg attgcaataa gtttacaact ggtgagacta 4260
tcgcccatgg caaggttggc cagggcatat cggcctggag caagaccttc tgtgctctgt 4320
ttggcccgtg gttccgcgcc attgaaaagg aaatattggc cctactcccg cctaatatct 4380
tttatggcga cgcctatgag gagtcagtgt ttgctgccgc tgtgtccggg gcagggtcat 4440
gtatggtatt tgaaaatgac ttctcagagt ttgacagtac ccagaataat ttctctctcg 4500
gccttgagtg tgtggttatg gaggagtgcg gcatgcccca atggttaatt aggttgtacc 4560
atctggtccg gtcagcctgg attttgcagg cgccgaagga gtctcttaag gggttttgga 4620
agaagcactc tggtgagcct ggtacccttc tctggaacac tgtctggaac atggcgatta 4680
tagcacattg ctaygagttc cgtgactttc gtgttgccgc cttcaagggt gatgattcag 4740
tggtcctctg tagtgactac cgacagrgcc gtaacgcggc tgccttaatt gcaggctgtg 4800
ggctcaaatt gaaggttgat taccgcccta tcgggctata tgctggagtg gtggtggccc 4860
ccggtttggg gacactgccc gatgtggtgc gttttgccgg tcggttatct gagaagaatt 4920
ggggccctgg cccggagcgt gctgagcagc tgcgtcttgc tgtttgtgat ttccttcgag 4980
ggctactgag


cgagctacaa 


tccgttatcg 


cccgctggtg 


atttctttct 
actaatacac
gccgtggtgc


tgatgggact 
acctgcactt


cgctggcacg 


gccctgacac 


tgttcaatct 


cgctgatacg 
atgttgtgtc ccgtgtctat ggagttagcc 5040
tgcagaccat tgctgatggc aaggcccact 5100
ttacaaattc catcatacaa cgggtggaat 5160
atcaccatgc gccctagggc tgttctgttg 5220
gcgccaccgg ccggccagcc gtctggccgc 5280
ggtggtttct ggggtgacag ggttgattct 5340
accaacccct tcgccgccga tgtcgtttca 5400
ccgccccgcc cccttggytc cgcttggcgt 5460
cgtcgtcgat ctgccccagc tggggctgcg 5520
acagcccctg tacctgatgt tgactcacgt 5580
tccacgtccc cgctcacgtc atctgtcgct 5640
ccgctgaatc ccctcttgcc tctccaggat 5700
gcatccaatt atgcccagta tcgggttgtt 5760
ccgaatgccg ttggtggcta tgccatttcc 5820
cctacttctg tcgatatgaa ttctattact 5880
ggtattgcct ccgagctagt catccccagt 5940
cgctctgttg agaccacggg tgtggctgag 6000
tgcattcatg gctctcctgt taattcctac 6060
cttcttgatt ttgcactaga gcttgaattt 6120
cgtgtttccc ggtataccag cacagcccgc 6180
gctgagctta ctaccacagc agccacacgt 6240
aatggcgttg gtgaggtggg tcgtggtatc 6300
cttctcggcg gtttaccgac agaattgatt 6360
cgcccggttg tctcagccaa tggcgagcca 6420
gcgcagcaag acaagggcat caccattcca 6480
gttatccagg attatgataa ccagcaygag 6540
tctcgcccct tctcagttct tcgtgccaat 6600
gagtatgacc agactacgta tgggtcgtcc 6660






accaacccta tgtatgtctc tgacacagtt acgcttgtta atgtggctac tggtgctcag 6720
gctgttgccc gctcccttga ttggtctaaa gttactctgg acggccgccc ccttactacc 6780
attcagcagt attctaagac attttatgtt ctcccgctcc gcgggaagct gtccttttgg 6840
gaggctggca cgactaaggc cggctaccct tacaattata atactaccgc tagtgaccaa 6900
attttgattg agaatgcggc cggccaccgt gtcgctattt ccacctatac cactagctta 6960
ggtgccggtc ctacctcgat ctctgcggtc ggcgtactgg ctccacactc tgcccttgcc 7020
gttcttgagg atactattga ttaccccgcc cgtgcccata cttttgatga tttttgcccg 7080
gagtgccgta ccctaggttt gcagggttgt gcattccagt ctactattgc tgagctccag 7140
cgtttaaaaa tgaaggtagg taaaacccgg gagtcttaat taattccttc tgtgccccct 7200 
tcgtagtttc tttcgctttt atttcttatt tctgctttcc gcgctccctg gaaaaaaaaa 7260 
aaaaaaaaaa aaaaaaa 7277 



<210> 165 
<211> 7277 
<212> DNA 

<213> Hepatitis E virus 

<220> 

<221> CDS 

<222> (36)..(5162)

<223> orf1

<220>
<221> CDS

<222> (5197)..(7179)
<223> orf2

<220>

<221> misc_feature
<222> (5159)..(5527)
<223> CDS-orf3

<220> 

<223> us2full 
<400> 165 

tcgacagggg gcagaccacg tatgtggtcg atgcc atg gag gcc cat cag ttc 53 

Met Glu Ala His Gln Phe
1 5

att aag gct cct ggc att act act gct att gag cag gct gct ctg gct 101
Ile Lys Ala Pro Gly Ile Thr Thr Ala Ile Glu Gln Ala Ala Leu Ala
10 15 20

gcg gct aat tcc gcc ttg gcg aat gct gtg gtg gtt cgg ccg ttt ctt 149
Ala Ala Asn Ser Ala Leu Ala Asn Ala Val Val Val Arg Pro Phe Leu
25 30 35

tct cgt gtg caa act gag att ctt att aat ttg atg caa ccc cgg cag 197
Ser Arg Val Gln Thr Glu Ile Leu Ile Asn Leu Met Gln Pro Arg Gln
40 45 50

ttg gtc ttc cgc cct gag gtg ctt tgg aat cat cct atc cag cgg gtt 245
Leu Val Phe Arg Pro Glu Val Leu Trp Asn His Pro Ile Gln Arg Val
55 60 65 70

ata cat aat gaa tta gag cag tac tgc cgg gcc cgg gct ggt cgt tgt 293
Ile His Asn Glu Leu Glu Gln Tyr Cys Arg Ala Arg Ala Gly Arg Cys
75 80 85

ttg gag gtt gga gcc cac ccg agg tcc att aat gac aac cct aat gtc 341
Leu Glu Val Gly Ala His Pro Arg Ser Ile Asn Asp Asn Pro Asn Val
90 95 100

ttg cat agg tgt ttt ctt aga ccg gtc ggc cga gat gtt cag cgc tgg 389
Leu His Arg Cys Phe Leu Arg Pro Val Gly Arg Asp Val Gln Arg Trp
105 110 115

tat tct gcc cct acc cgt ggt cct gcg gcc aat tgc cgc cgc tcc gcg 437
Tyr Ser Ala Pro Thr Arg Gly Pro Ala Ala Asn Cys Arg Arg Ser Ala
120 125 130

ttg cgt ggt ctc ccc cct gtc gac cgc acc tat tgt ttt gat gga ttt 485
Leu Arg Gly Leu Pro Pro Val Asp Arg Thr Tyr Cys Phe Asp Gly Phe
135 140 145 150

tcc cgt tgt gct ttt gct gca gag acc ggt gtg gcc ctt tac tct ttg 533
Ser Arg Cys Ala Phe Ala Ala Glu Thr Gly Val Ala Leu Tyr Ser Leu
155 160 165

cat gac ctt tgg cca gct gat gtt gca gag gct atg gcc cgc cat ggg 581
His Asp Leu Trp Pro Ala Asp Val Ala Glu Ala Met Ala Arg His Gly
170 175 180

atg aca cgc tta tac gcc gca ctg cac ctt ccc ccc gag gtg ctg cta 629
Met Thr Arg Leu Tyr Ala Ala Leu His Leu Pro Pro Glu Val Leu Leu
185 190 195

cca ccc ggc acc tac cac aca acc tcg tac ctc ttg att cac gat ggc 677
Pro Pro Gly Thr Tyr His Thr Thr Ser Tyr Leu Leu Ile His Asp Gly
200 205 210

aac cgc gct gtt gta act tac gag ggc gat act agt gcg ggc tat aat 725
Asn Arg Ala Val Val Thr Tyr Glu Gly Asp Thr Ser Ala Gly Tyr Asn
215 220 225 230

cat gat gtc tcc ata ctt cgt gca tgg atc cgt act act aaa ata gtt 773
His Asp Val Ser Ile Leu Arg Ala Trp Ile Arg Thr Thr Lys Ile Val
235 240 245

ggt gac cat cca ttg gtc ata gag cga gtg cgg gcc att ggg tgt cat 821
Gly Asp His Pro Leu Val Ile Glu Arg Val Arg Ala Ile Gly Cys His
250 255 260



ttt gtg ctg ctg ctc acc gca gcc cct gaa ccg tca cct atg cct tat 869
Phe Val Leu Leu Leu Thr Ala Ala Pro Glu Pro Ser Pro Met Pro Tyr
265 270 275

gtt ccc tac cct cgt tca acg gag gtg tat gtc cgg tct ata ttt ggc 917
Val Pro Tyr Pro Arg Ser Thr Glu Val Tyr Val Arg Ser Ile Phe Gly
280 285 290

cct ggc ggc tcc cca tcc ttg ttt cca tca gcc tgc tct act aaa tct 965
Pro Gly Gly Ser Pro Ser Leu Phe Pro Ser Ala Cys Ser Thr Lys Ser
295 300 305 310

acc ttt cat gct gtc ccg gtt cac atc tgg gat crg ctc atg ctc ttt 1013
Thr Phe His Ala Val Pro Val His Ile Trp Asp Xaa Leu Met Leu Phe
315 320 325

ggt gcc acc ctg rac gat cag gcg ttc tgc tgt tca cgg ctt atg act 1061
Gly Ala Thr Leu Xaa Asp Gln Ala Phe Cys Cys Ser Arg Leu Met Thr
330 335 340

tac ctc cgt ggt att agt tat aag gtc act gtc ggt gcg ctt gtc gct 1109
Tyr Leu Arg Gly Ile Ser Tyr Lys Val Thr Val Gly Ala Leu Val Ala
345 350 355

aat gag ggg tgg aac gcc tct gag gat gct ctt act gca gtg atc act 1157
Asn Glu Gly Trp Asn Ala Ser Glu Asp Ala Leu Thr Ala Val Ile Thr
360 365 370

gcg gcc tat ctg acc atc tgc cat cag cgt tac ctt cgc acc cag gcg 1205
Ala Ala Tyr Leu Thr Ile Cys His Gln Arg Tyr Leu Arg Thr Gln Ala
375 380 385 390

att tcc aag ggc atg cgc cgg ttg gag gtt gag cat gct cag aaa ttt 1253
Ile Ser Lys Gly Met Arg Arg Leu Glu Val Glu His Ala Gln Lys Phe
395 400 405

atc aca aga ctc tac agc tgg cta ttt gag aag tct ggc cgt gac tac 1301
Ile Thr Arg Leu Tyr Ser Trp Leu Phe Glu Lys Ser Gly Arg Asp Tyr
410 415 420

atc ccc ggc cgc cag ctt caa ttt tat gca caa tgc cga cgg tgg ctt 1349
Ile Pro Gly Arg Gln Leu Gln Phe Tyr Ala Gln Cys Arg Arg Trp Leu
425 430 435

tct gca ggc ttc cac cta rac ccc agg rtg ctt gtc ttt gat gaa tca 1397
Ser Ala Gly Phe His Leu Xaa Pro Arg Xaa Leu Val Phe Asp Glu Ser
440 445 450

gtg cca tgc cgt tgc agg acg ttt ttg aag aag gtc gcg ggt aaa ttc 1445
Val Pro Cys Arg Cys Arg Thr Phe Leu Lys Lys Val Ala Gly Lys Phe
455 460 465 470

tgc tgt ttt atg cgg tgg ctg ggg cag gag tgt acc tgc ttc ttg gag 1493
Cys Cys Phe Met Arg Trp Leu Gly Gln Glu Cys Thr Cys Phe Leu Glu
475 480 485

cca gcc gag ggt tta gtt ggt gat caa ggt cat gac aac gag gcc tat 1541
Pro Ala Glu Gly Leu Val Gly Asp Gln Gly His Asp Asn Glu Ala Tyr
490 495 500

gaa ggt tct gag gtc gac cca gct gag cct gca cat ctt gat gtc tcg 1589
Glu Gly Ser Glu Val Asp Pro Ala Glu Pro Ala His Leu Asp Val Ser
505 510 515

ggg act tat gcc gtc cat ggg cac cag ctt gag gcc ctc tat agg gca 1637
Gly Thr Tyr Ala Val His Gly His Gln Leu Glu Ala Leu Tyr Arg Ala
520 525 530

ctt aat gtc cca cat gat att gcc gct cga gcc tcc cga cta acg gct 1685
Leu Asn Val Pro His Asp Ile Ala Ala Arg Ala Ser Arg Leu Thr Ala
535 540 545 550

act gtt gag ctc gtt gct agt ccg gac cgc tta gag tgc cgc act gta 1733
Thr Val Glu Leu Val Ala Ser Pro Asp Arg Leu Glu Cys Arg Thr Val
555 560 565

ctt ggt aat aag acc ttc cgg acg acg gtg gtt gat ggc gcc cat ctt 1781
Leu Gly Asn Lys Thr Phe Arg Thr Thr Val Val Asp Gly Ala His Leu
570 575 580

gaa gcg aat ggc cct gag gag tat gtt ctg tca ttt gac gcc tct cgc 1829
Glu Ala Asn Gly Pro Glu Glu Tyr Val Leu Ser Phe Asp Ala Ser Arg
585 590 595

cag tct atg ggg gcc ggg tcg cac agc ctc act tat gag ctc acc cct 1877
Gln Ser Met Gly Ala Gly Ser His Ser Leu Thr Tyr Glu Leu Thr Pro
600 605 610

gcc ggt ctg cag gta aag att tca tct aat ggt ctg gat tgc act gcc 1925
Ala Gly Leu Gln Val Lys Ile Ser Ser Asn Gly Leu Asp Cys Thr Ala
615 620 625 630

aca ttc ccc ccy ggt ggc gcc cct agc gcc gcg ccg ggg gag gtg gcs 1973
Thr Phe Pro Xaa Gly Gly Ala Pro Ser Ala Ala Pro Gly Glu Val Xaa
635 640 645

gcc ttc tgc agt gct ctt tat aga tac aat agg ttc acc cag cgg cat 2021
Ala Phe Cys Ser Ala Leu Tyr Arg Tyr Asn Arg Phe Thr Gln Arg His
650 655 660

tcg ctg aca ggc gga cta tgg cta cat cct gag ggg ctg ctg ggt atc 2069
Ser Leu Thr Gly Gly Leu Trp Leu His Pro Glu Gly Leu Leu Gly Ile
665 670 675

ttc ccc cca ttc tcc cct ggg cat att tgg gag tct gct aac ccc ttt 2117
Phe Pro Pro Phe Ser Pro Gly His Ile Trp Glu Ser Ala Asn Pro Phe
680 685 690

tgc ggt gag ggg act ttg tat acc cga acc tgg tca acc tct ggt ttt 2165
Cys Gly Glu Gly Thr Leu Tyr Thr Arg Thr Trp Ser Thr Ser Gly Phe
695 700 705 710

tct agt gat ttc tcc ccc cct gag gcg gcc gct cct gct tcg gct gcc 2213
Ser Ser Asp Phe Ser Pro Pro Glu Ala Ala Ala Pro Ala Ser Ala Ala
715 720 725

gcc ccg ggg ttg ccc tac cct act cca cct gtt agt gat atc tgg gtg 2261
Ala Pro Gly Leu Pro Tyr Pro Thr Pro Pro Val Ser Asp Ile Trp Val
730 735 740

tta cca ccg ccc tca gag gaa tct cat gtt gat gcg gca tct gta ccc 2309
Leu Pro Pro Pro Ser Glu Glu Ser His Val Asp Ala Ala Ser Val Pro
745 750 755

tct gtt cct gag cct gct gga ttg acc agc cct att gtg ctt acc ccc 2357
Ser Val Pro Glu Pro Ala Gly Leu Thr Ser Pro Ile Val Leu Thr Pro
760 765 770

ccc ccc ccc cct cct ccc gtg cgt aag ccg gca aca tcc ccg cct ccc 2405
Pro Pro Pro Pro Pro Pro Val Arg Lys Pro Ala Thr Ser Pro Pro Pro
775 780 785 790

cgc act cgc cgt ctc ctt tac acc tac ccc gac ggc gcc aag gtg tat 2453
Arg Thr Arg Arg Leu Leu Tyr Thr Tyr Pro Asp Gly Ala Lys Val Tyr
795 800 805

9 C 9 <3$9 tca tfc 9 tJct 9 a 9 tca 9 ac tgt gat tgg tta gtc aat gcc tea 25 01 
Ala Gly Ser Leu Xaa Glu Ser Asp Cys Asp Trp Leu Val Asn Ala Ser 
810 815 820 

aac cct ggc cat cgc ccc ggg ggt ggc ctc tgc cat gct ttt tat caa 2549
Asn Pro Gly His Arg Pro Gly Gly Gly Leu Cys His Ala Phe Tyr Gln
825 830 835

cgt ttc cca gaa gcg ttc tac tcg act gaa ttc atc atg cgc gag ggc 2597
Arg Phe Pro Glu Ala Phe Tyr Ser Thr Glu Phe Ile Met Arg Glu Gly
840 845 850

ctt gca gca tac act tta acc ccg cgc cct att atc cat gca gtg gct 2645
Leu Ala Ala Tyr Thr Leu Thr Pro Arg Pro Ile Ile His Ala Val Ala
855 860 865 870

ccc gac tat agg gtt gag caa aac ccg aag agg ctt gag gca gcg tac 2693
Pro Asp Tyr Arg Val Glu Gln Asn Pro Lys Arg Leu Glu Ala Ala Tyr
875 880 885

cgg gaa act tgc tcc cgt cgt ggc acc gct gcc tac ccg ctt ttg ggc 2741
Arg Glu Thr Cys Ser Arg Arg Gly Thr Ala Ala Tyr Pro Leu Leu Gly
890 895 900

tcg ggt ata tac cag gtc cct gtt agc ctc agt ttt gat gcc tgg gaa 2789
Ser Gly Ile Tyr Gln Val Pro Val Ser Leu Ser Phe Asp Ala Trp Glu
905 910 915



cgc aat cac cgc ccc ggc gat gag ctt tac ttg aca gag ccc gcc gca 2837 
Arg Asn His Arg Pro Gly Asp Glu Leu Tyr Leu Thr Glu Pro Ala Ala 
920 925 930 

gcc tgg ttt gag gct aat aag ccg gcg cag ccg gcg ctt act ata act 2885
Ala Trp Phe Glu Ala Asn Lys Pro Ala Gln Pro Ala Leu Thr Ile Thr
935 940 945 950

gag gac acg gcc cgt acg gcc aac ctg gca tta gag att gat gcc gcc 2933
Glu Asp Thr Ala Arg Thr Ala Asn Leu Ala Leu Glu Ile Asp Ala Ala
955 960 965

aca gag gtt ggc cgt gct tgt gcc ggc tgc acc atc agc ccc ggg att 2981
Thr Glu Val Gly Arg Ala Cys Ala Gly Cys Thr Ile Ser Pro Gly Ile
970 975 980

gtg cac tat cag ttt acc gcc ggg gtc ccg ggc tca ggc aag tca agg 3029
Val His Tyr Gln Phe Thr Ala Gly Val Pro Gly Ser Gly Lys Ser Arg
985 990 995

tcc ata caa cag gga gat gtc gat gtg gtg gtt gtg ccc acc cgg gag 3077
Ser Ile Gln Gln Gly Asp Val Asp Val Val Val Val Pro Thr Arg Glu
1000 1005 1010

ctc cgt aac agc tgg cgt cgc cgg ggt ttt gcg gcc ttc aca cct cac 3125
Leu Arg Asn Ser Trp Arg Arg Arg Gly Phe Ala Ala Phe Thr Pro His
1015 1020 1025 1030

aca gcg gcc cgt gtt act atc ggc cgc cgc gtt gtg att gat gag gct 3173
Thr Ala Ala Arg Val Thr Ile Gly Arg Arg Val Val Ile Asp Glu Ala
1035 1040 1045

cca tct ctc cca ccg cac ctg ctg ctg tta cac atg cag cgg gcc tcc 3221
Pro Ser Leu Pro Pro His Leu Leu Leu Leu His Met Gln Arg Ala Ser
1050 1055 1060

tcg gtc cat ctc ctt ggt gat cca aac cag att cct gct att gat ttt 3269
Ser Val His Leu Leu Gly Asp Pro Asn Gln Ile Pro Ala Ile Asp Phe
1065 1070 1075



gag cat gcc ggc ctg gtc ccc gcg atc cgc ccc gag ctt gcg cca acg 3317
Glu His Ala Gly Leu Val Pro Ala Ile Arg Pro Glu Leu Ala Pro Thr
1080 1085 1090

agc tgg tgg cac gtt aca cac cgt tgc ccg gcc gat gtg tgc gag ctc 3365
Ser Trp Trp His Val Thr His Arg Cys Pro Ala Asp Val Cys Glu Leu
1095 1100 1105 1110

ata cgt ggg gcc tac ccc aaa att cag acc acg agc cgt gtg cta cgg 3413
Ile Arg Gly Ala Tyr Pro Lys Ile Gln Thr Thr Ser Arg Val Leu Arg
1115 1120 1125

tcc ctg ttt tgg aac gaa ccg gcc atc ggc caa aag ttg gtt ttt acg 3461
Ser Leu Phe Trp Asn Glu Pro Ala Ile Gly Gln Lys Leu Val Phe Thr
1130 1135 1140

cag gct gct aag gct gcc aac cct ggt gcg att acg gtt cac gaa gct 3509
Gln Ala Ala Lys Ala Ala Asn Pro Gly Ala Ile Thr Val His Glu Ala
1145 1150 1155

cag ggt gct act ttc acg gag acc aca att ata gcc acg gcc gac gct 3557
Gln Gly Ala Thr Phe Thr Glu Thr Thr Ile Ile Ala Thr Ala Asp Ala
1160 1165 1170

agg ggc ctc att cag tca tcc cgg gcc cat gct ata gtc gca ctc acc 3605
Arg Gly Leu Ile Gln Ser Ser Arg Ala His Ala Ile Val Ala Leu Thr
1175 1180 1185 1190

cgc cat act gag aag tgt gtt att ttg gat gcc ccc ggc ttg ttg cgc 3653
Arg His Thr Glu Lys Cys Val Ile Leu Asp Ala Pro Gly Leu Leu Arg
1195 1200 1205

gag gtc ggc att tcg gat gtt att gtc aat aac ttt ttc ctt gcc ggt 3701
Glu Val Gly Ile Ser Asp Val Ile Val Asn Asn Phe Phe Leu Ala Gly
1210 1215 1220

gga gag gtc ggc cat cac cgc cct tct gtg ata cct cgc ggc aat cct 3749
Gly Glu Val Gly His His Arg Pro Ser Val Ile Pro Arg Gly Asn Pro
1225 1230 1235

gat cag aac ctc ggg act cta cag gcc ttt ccg ccg tca tgt cag atc 3797
Asp Gln Asn Leu Gly Thr Leu Gln Ala Phe Pro Pro Ser Cys Gln Ile
1240 1245 1250

agt gct tac cat cag ttg gct gag gaa cta ggt cat cgc ccg gcc cct 3845
Ser Ala Tyr His Gln Leu Ala Glu Glu Leu Gly His Arg Pro Ala Pro
1255 1260 1265 1270



gtc gcc gcc gtc ttg ccc cct tgc cct gag ctt gag cag ggc ctg ctc 3893
Val Ala Ala Val Leu Pro Pro Cys Pro Glu Leu Glu Gln Gly Leu Leu
1275 1280 1285

tat atg cca caa gaa ctt act gtg tcc gat agc gtg ctg gtt ttt gag 3941
Tyr Met Pro Gln Glu Leu Thr Val Ser Asp Ser Val Leu Val Phe Glu
1290 1295 1300

ctt acg gat ata gtc cac tgc cgt atg gcc gcc cca agc cag cga aag 3989
Leu Thr Asp Ile Val His Cys Arg Met Ala Ala Pro Ser Gln Arg Lys
1305 1310 1315

gct gtt ctc tca acg ctt gtg ggg agg tac ggc cgt agg act aaa tta 4037
Ala Val Leu Ser Thr Leu Val Gly Arg Tyr Gly Arg Arg Thr Lys Leu
1320 1325 1330

tat gag gcg gcg cat tca gat gtc cgt gag tcc cta gcg agg ttt atc 4085
Tyr Glu Ala Ala His Ser Asp Val Arg Glu Ser Leu Ala Arg Phe Ile
1335 1340 1345 1350

ccc acc atc ggg cct gtt cgg gct acc aca tgt gag ctg tac gag ctg 4133
Pro Thr Ile Gly Pro Val Arg Ala Thr Thr Cys Glu Leu Tyr Glu Leu
1355 1360 1365

gtt gaa gcc atg gta gag aag ggt cag gac gga tct gcc gtc cta gag 4181
Val Glu Ala Met Val Glu Lys Gly Gln Asp Gly Ser Ala Val Leu Glu
1370 1375 1380

ctc gac ctt tgc aat cgt gac gtc tcg cgc atc aca ttt ttc caa aag 4229
Leu Asp Leu Cys Asn Arg Asp Val Ser Arg Ile Thr Phe Phe Gln Lys
1385 1390 1395

gat tgc aat aag ttt aca act ggt gag act atc gcc cat ggc aag gtt 4277
Asp Cys Asn Lys Phe Thr Thr Gly Glu Thr Ile Ala His Gly Lys Val
1400 1405 1410

ggc cag ggc ata tcg gcc tgg agc aag acc ttc tgt gct ctg ttt ggc 4325
Gly Gln Gly Ile Ser Ala Trp Ser Lys Thr Phe Cys Ala Leu Phe Gly
1415 1420 1425 1430

ccg tgg ttc cgc gcc att gaa aag gaa ata ttg gcc cta ctc ccg cct 4373
Pro Trp Phe Arg Ala Ile Glu Lys Glu Ile Leu Ala Leu Leu Pro Pro
1435 1440 1445

aat atc ttt tat ggc gac gcc tat gag gag tca gtg ttt gct gcc gct 4421
Asn Ile Phe Tyr Gly Asp Ala Tyr Glu Glu Ser Val Phe Ala Ala Ala
1450 1455 1460

gtg tcc ggg gca ggg tca tgt atg gta ttt gaa aat gac ttc tca gag 4469
Val Ser Gly Ala Gly Ser Cys Met Val Phe Glu Asn Asp Phe Ser Glu
1465 1470 1475

ttt gac agt acc cag aat aat ttc tct ctc ggc ctt gag tgt gtg gtt 4517
Phe Asp Ser Thr Gln Asn Asn Phe Ser Leu Gly Leu Glu Cys Val Val
1480 1485 1490

atg gag gag tgc ggc atg ccc caa tgg tta att agg ttg tac cat ctg 4565
Met Glu Glu Cys Gly Met Pro Gln Trp Leu Ile Arg Leu Tyr His Leu
1495 1500 1505 1510

gtc cgg tca gcc tgg att ttg cag gcg ccg aag gag tct ctt aag ggg 4613
Val Arg Ser Ala Trp Ile Leu Gln Ala Pro Lys Glu Ser Leu Lys Gly
1515 1520 1525

ttt tgg aag aag cac tct ggt gag cct ggt acc ctt ctc tgg aac act 4661
Phe Trp Lys Lys His Ser Gly Glu Pro Gly Thr Leu Leu Trp Asn Thr
1530 1535 1540

gtc tgg aac atg gcg att ata gca cat tgc tay gag ttc cgt gac ttt 4709 
Val Trp Asn Met Ala Ile Ile Ala His Cys Xaa Glu Phe Arg Asp Phe
1545 1550 1555

cgt gtt gcc gcc ttc aag ggt gat gat tca gtg gtc ctc tgt agt gac 4757
Arg Val Ala Ala Phe Lys Gly Asp Asp Ser Val Val Leu Cys Ser Asp
1560 1565 1570

tac cga cag rgc cgt aac gcg gct gcc tta att gca ggc tgt ggg ctc 4805
Tyr Arg Gln Xaa Arg Asn Ala Ala Ala Leu Ile Ala Gly Cys Gly Leu
1575 1580 1585 1590

aaa ttg aag gtt gat tac cgc cct atc ggg cta tat gct gga gtg gtg 4853
Lys Leu Lys Val Asp Tyr Arg Pro Ile Gly Leu Tyr Ala Gly Val Val
1595 1600 1605

gtg gcc ccc ggt ttg ggg aca ctg ccc gat gtg gtg cgt ttt gcc ggt 4901
Val Ala Pro Gly Leu Gly Thr Leu Pro Asp Val Val Arg Phe Ala Gly
1610 1615 1620

cgg tta tct gag aag aat tgg ggc cct ggc ccg gag cgt gct gag cag 4949
Arg Leu Ser Glu Lys Asn Trp Gly Pro Gly Pro Glu Arg Ala Glu Gln
1625 1630 1635

ctg cgt ctt gct gtt tgt gat ttc ctt cga ggg ttg acg aat gtt gcg 4997
Leu Arg Leu Ala Val Cys Asp Phe Leu Arg Gly Leu Thr Asn Val Ala
1640 1645 1650

cag gtc tgt gtt gat gtt gtg tcc cgt gtc tat gga gtt agc ccc ggg 5045
Gln Val Cys Val Asp Val Val Ser Arg Val Tyr Gly Val Ser Pro Gly
1655 1660 1665 1670

ctg gta cat aac ctt att ggc atg ctg cag acc att gct gat ggc aag 5093
Leu Val His Asn Leu Ile Gly Met Leu Gln Thr Ile Ala Asp Gly Lys
1675 1680 1685

gcc cac ttt aca gar aat att aaa cct gtg ctt gac ctt aca aat tcc 5141
Ala His Phe Thr Xaa Asn Ile Lys Pro Val Leu Asp Leu Thr Asn Ser
1690 1695 1700

atc ata caa cgg gtg gaa tga ataacatgtc ttttgcatcg cccatgggat cacc 5196
Ile Ile Gln Arg Val Glu
1705

atg cgc cct agg gct gtt ctg ttg ttg ctc ttc gtg ctt ttg cct atg 5244
Met Arg Pro Arg Ala Val Leu Leu Leu Leu Phe Val Leu Leu Pro Met
1710 1715 1720 1725

ctg ccc gcg cca ccg gcc ggc cag ccg tct ggc cgc cgt cgt ggg cgg 5292
Leu Pro Ala Pro Pro Ala Gly Gln Pro Ser Gly Arg Arg Arg Gly Arg
1730 1735 1740

cgc agc ggc ggt gcc ggc ggt ggt ttc tgg ggt gac agg gtt gat tct 5340
Arg Ser Gly Gly Ala Gly Gly Gly Phe Trp Gly Asp Arg Val Asp Ser
1745 1750 1755

cag ccc ttc gcc ctc ccc tat att cat cca acc aac ccc ttc gcc gcc 5388
Gln Pro Phe Ala Leu Pro Tyr Ile His Pro Thr Asn Pro Phe Ala Ala
1760 1765 1770

gat gtc gtt tca caa ccc ggg gct gga act cgc cct cga cag ccg ccc 5436
Asp Val Val Ser Gln Pro Gly Ala Gly Thr Arg Pro Arg Gln Pro Pro
1775 1780 1785

cgc ccc ctt ggy tcc gct tgg cgt gac cag tcc cag cgc ccc tcc gct 5484
Arg Pro Leu Xaa Ser Ala Trp Arg Asp Gln Ser Gln Arg Pro Ser Ala
1790 1795 1800 1805

gcc ccc cgt cgt cga tct gcc cca gct ggg gct gcg ccg ctg act gcc 5532
Ala Pro Arg Arg Arg Ser Ala Pro Ala Gly Ala Ala Pro Leu Thr Ala
1810 1815 1820

gtg tca ccg gct cct gac aca gcc cct gta cct gat gtt gac tca cgt 5580
Val Ser Pro Ala Pro Asp Thr Ala Pro Val Pro Asp Val Asp Ser Arg
1825 1830 1835

ggt gct att ctg cgc cgg cag tac aat ttg tcc acg tcc ccg ctc acg 5628
Gly Ala Ile Leu Arg Arg Gln Tyr Asn Leu Ser Thr Ser Pro Leu Thr
1840 1845 1850

tca tct gtc gct tcg ggt act aat ttg gtc ctc tat gct gcc ccg ctg 5676
Ser Ser Val Ala Ser Gly Thr Asn Leu Val Leu Tyr Ala Ala Pro Leu
1855 1860 1865

aat ccc ctc ttg cct ctc cag gat ggt acc aac act cat att atg gct 5724
Asn Pro Leu Leu Pro Leu Gln Asp Gly Thr Asn Thr His Ile Met Ala
1870 1875 1880 1885

act gag gca tcc aat tat gcc cag tat cgg gtt gtt cga gct aca atc 5772
Thr Glu Ala Ser Asn Tyr Ala Gln Tyr Arg Val Val Arg Ala Thr Ile
1890 1895 1900

cgt tat cgc ccg ctg gtg ccg aat gcc gtt ggt ggc tat gcc att tcc 5820
Arg Tyr Arg Pro Leu Val Pro Asn Ala Val Gly Gly Tyr Ala Ile Ser
1905 1910 1915

att tct ttc tgg ccc caa act aca act acc cct act tct gtc gat atg 5868
Ile Ser Phe Trp Pro Gln Thr Thr Thr Thr Pro Thr Ser Val Asp Met
1920 1925 1930

aat tct att act tcc acy gat gtt agg att ttg gtt cag ccc ggt att 5916
Asn Ser Ile Thr Ser Xaa Asp Val Arg Ile Leu Val Gln Pro Gly Ile
1935 1940 1945

gcc tcc gag cta gtc atc ccc agt gag cgc ctt cat tac cgt aat caa 5964
Ala Ser Glu Leu Val Ile Pro Ser Glu Arg Leu His Tyr Arg Asn Gln
1950 1955 1960 1965

ggc tgg cgc tct gtt gag acc acg ggt gtg gct gag gag gag gct act 6012
Gly Trp Arg Ser Val Glu Thr Thr Gly Val Ala Glu Glu Glu Ala Thr
1970 1975 1980

tcc ggt ctg gta atg ctt tgc att cat ggc tct cct gtt aat tcc tac 6060
Ser Gly Leu Val Met Leu Cys Ile His Gly Ser Pro Val Asn Ser Tyr
1985 1990 1995

act aat aca cct tac act ggt gcg ctg ggg ctt ctt gat ttt gca cta 6108
Thr Asn Thr Pro Tyr Thr Gly Ala Leu Gly Leu Leu Asp Phe Ala Leu
2000 2005 2010

gag ctt gaa ttt agg aat ttg aca ccc ggg aac acc aac acc cgt gtt 6156
Glu Leu Glu Phe Arg Asn Leu Thr Pro Gly Asn Thr Asn Thr Arg Val
2015 2020 2025

tcc cgg tat acc agc aca gcc cgc cac cgg ctg cgc cgt ggt gct gat 6204
Ser Arg Tyr Thr Ser Thr Ala Arg His Arg Leu Arg Arg Gly Ala Asp
2030 2035 2040 2045

ggg act gct gag ctt act acc aca gca gcc aca cgt ttc atg aag gac 6252
Gly Thr Ala Glu Leu Thr Thr Thr Ala Ala Thr Arg Phe Met Lys Asp
2050 2055 2060

ctg cac ttc gct ggc acg aat ggc gtt ggt gag gtg ggt cgt ggt atc 6300
Leu His Phe Ala Gly Thr Asn Gly Val Gly Glu Val Gly Arg Gly Ile
2065 2070 2075

gcc ctg aca ctg ttc aat ctc gct gat acg ctt ctc ggc ggt tta ccg 6348
Ala Leu Thr Leu Phe Asn Leu Ala Asp Thr Leu Leu Gly Gly Leu Pro
2080 2085 2090

aca gaa ttg att tcg tcg gct ggg ggc caa ctg ttt tac tcc cgc ccg 6396
Thr Glu Leu Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro
2095 2100 2105

gtt gtc tca gcc aat ggc gag cca aca gta aag tta tat aca tct gtt 6444
Val Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val
2110 2115 2120 2125

gag aat gcg cag caa gac aag ggc atc acc att cca cat gat ata gac 6492
Glu Asn Ala Gln Gln Asp Lys Gly Ile Thr Ile Pro His Asp Ile Asp
2130 2135 2140

ctg ggt gac tcc cgt gtg gtt atc cag gat tat gat aac cag cay gag 6540
Leu Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Asp Asn Gln Xaa Glu
2145 2150 2155

caa gac cga cct act ccg tca cct gcc ccc tct cgc ccc ttc tca gtt 6588
Gln Asp Arg Pro Thr Pro Ser Pro Ala Pro Ser Arg Pro Phe Ser Val
2160 2165 2170

ctt cgt gcc aat gat gtt ttg tgg ctt tcc ctc act gcc gct gag tat 6636
Leu Arg Ala Asn Asp Val Leu Trp Leu Ser Leu Thr Ala Ala Glu Tyr
2175 2180 2185

gac cag act acg tat ggg tcg tcc acc aac cct atg tat gtc tct gac 6684
Asp Gln Thr Thr Tyr Gly Ser Ser Thr Asn Pro Met Tyr Val Ser Asp
2190 2195 2200 2205

aca gtt acg ctt gtt aat gtg gct act ggt gct cag gct gtt gcc cgc 6732
Thr Val Thr Leu Val Asn Val Ala Thr Gly Ala Gln Ala Val Ala Arg
2210 2215 2220

tcc ctt gat tgg tct aaa gtt act ctg gac ggc cgc ccc ctt act acc 6780
Ser Leu Asp Trp Ser Lys Val Thr Leu Asp Gly Arg Pro Leu Thr Thr
2225 2230 2235

att cag cag tat tct aag aca ttt tat gtt ctc ccg ctc cgc ggg aag 6828
Ile Gln Gln Tyr Ser Lys Thr Phe Tyr Val Leu Pro Leu Arg Gly Lys
2240 2245 2250

ctg tcc ttt tgg gag gct ggc acg act aag gcc ggc tac cct tac aat 6876
Leu Ser Phe Trp Glu Ala Gly Thr Thr Lys Ala Gly Tyr Pro Tyr Asn
2255 2260 2265

tat aat act acc gct agt gac caa att ttg att gag aat gcg gcc ggc 6924
Tyr Asn Thr Thr Ala Ser Asp Gln Ile Leu Ile Glu Asn Ala Ala Gly
2270 2275 2280 2285

cac cgt gtc gct att tcc acc tat acc act agc tta ggt gcc ggt cct 6972
His Arg Val Ala Ile Ser Thr Tyr Thr Thr Ser Leu Gly Ala Gly Pro
2290 2295 2300

acc tcg atc tct gcg gtc ggc gta ctg gct cca cac tct gcc ctt gcc 7020
Thr Ser Ile Ser Ala Val Gly Val Leu Ala Pro His Ser Ala Leu Ala
2305 2310 2315

gtt ctt gag gat act att gat tac ccc gcc cgt gcc cat act ttt gat 7068
Val Leu Glu Asp Thr Ile Asp Tyr Pro Ala Arg Ala His Thr Phe Asp
2320 2325 2330

gat ttt tgc ccg gag tgc cgt acc cta ggt ttg cag ggt tgt gca ttc 7116
Asp Phe Cys Pro Glu Cys Arg Thr Leu Gly Leu Gln Gly Cys Ala Phe
2335 2340 2345

cag tct act att gct gag ctc cag cgt tta aaa atg aag gta ggt aaa 7164
Gln Ser Thr Ile Ala Glu Leu Gln Arg Leu Lys Met Lys Val Gly Lys
2350 2355 2360 2365

acc cgg gag tct taa ttaattcctt ctgtgccccc ttcgtagttt ctttcgcttt 7219
Thr Arg Glu Ser

2370

tatttcttat ttctgctttc cgcgctccct ggaaaaaaaa aaaaaaaaaa aaaaaaaa 7277



<210> 166 
<211> 1708 
<212> PRT 

<213> Hepatitis E virus 
<400> 166 

Met Glu Ala His Gln Phe Ile Lys Ala Pro Gly Ile Thr Thr Ala Ile
1 5 10 15

Glu Gln Ala Ala Leu Ala Ala Ala Asn Ser Ala Leu Ala Asn Ala Val
20 25 30

Val Val Arg Pro Phe Leu Ser Arg Val Gln Thr Glu Ile Leu Ile Asn
35 40 45

Leu Met Gln Pro Arg Gln Leu Val Phe Arg Pro Glu Val Leu Trp Asn
50 55 60

His Pro Ile Gln Arg Val Ile His Asn Glu Leu Glu Gln Tyr Cys Arg
65 70 75 80

Ala Arg Ala Gly Arg Cys Leu Glu Val Gly Ala His Pro Arg Ser Ile
85 90 95

Asn Asp Asn Pro Asn Val Leu His Arg Cys Phe Leu Arg Pro Val Gly
100 105 110

Arg Asp Val Gln Arg Trp Tyr Ser Ala Pro Thr Arg Gly Pro Ala Ala
115 120 125

Asn Cys Arg Arg Ser Ala Leu Arg Gly Leu Pro Pro Val Asp Arg Thr
130 135 140

Tyr Cys Phe Asp Gly Phe Ser Arg Cys Ala Phe Ala Ala Glu Thr Gly
145 150 155 160

Val Ala Leu Tyr Ser Leu His Asp Leu Trp Pro Ala Asp Val Ala Glu
165 170 175

Ala Met Ala Arg His Gly Met Thr Arg Leu Tyr Ala Ala Leu His Leu
180 185 190

Pro Pro Glu Val Leu Leu Pro Pro Gly Thr Tyr His Thr Thr Ser Tyr
195 200 205

Leu Leu Ile His Asp Gly Asn Arg Ala Val Val Thr Tyr Glu Gly Asp
210 215 220

Thr Ser Ala Gly Tyr Asn His Asp Val Ser Ile Leu Arg Ala Trp Ile
225 230 235 240

Arg Thr Thr Lys Ile Val Gly Asp His Pro Leu Val Ile Glu Arg Val
245 250 255

Arg Ala Ile Gly Cys His Phe Val Leu Leu Leu Thr Ala Ala Pro Glu
260 265 270

Pro Ser Pro Met Pro Tyr Val Pro Tyr Pro Arg Ser Thr Glu Val Tyr
275 280 285

Val Arg Ser Ile Phe Gly Pro Gly Gly Ser Pro Ser Leu Phe Pro Ser
290 295 300

Ala Cys Ser Thr Lys Ser Thr Phe His Ala Val Pro Val His Ile Trp
305 310 315 320

Asp Xaa Leu Met Leu Phe Gly Ala Thr Leu Xaa Asp Gln Ala Phe Cys
325 330 335

Cys Ser Arg Leu Met Thr Tyr Leu Arg Gly Ile Ser Tyr Lys Val Thr
340 345 350

Val Gly Ala Leu Val Ala Asn Glu Gly Trp Asn Ala Ser Glu Asp Ala
355 360 365

Leu Thr Ala Val Ile Thr Ala Ala Tyr Leu Thr Ile Cys His Gln Arg
370 375 380

Tyr Leu Arg Thr Gln Ala Ile Ser Lys Gly Met Arg Arg Leu Glu Val
385 390 395 400

Glu His Ala Gln Lys Phe Ile Thr Arg Leu Tyr Ser Trp Leu Phe Glu
405 410 415

Lys Ser Gly Arg Asp Tyr Ile Pro Gly Arg Gln Leu Gln Phe Tyr Ala
420 425 430

Gln Cys Arg Arg Trp Leu Ser Ala Gly Phe His Leu Xaa Pro Arg Xaa
435 440 445

Leu Val Phe Asp Glu Ser Val Pro Cys Arg Cys Arg Thr Phe Leu Lys
450 455 460

Lys Val Ala Gly Lys Phe Cys Cys Phe Met Arg Trp Leu Gly Gln Glu
465 470 475 480

Cys Thr Cys Phe Leu Glu Pro Ala Glu Gly Leu Val Gly Asp Gln Gly
485 490 495

His Asp Asn Glu Ala Tyr Glu Gly Ser Glu Val Asp Pro Ala Glu Pro
500 505 510

Ala His Leu Asp Val Ser Gly Thr Tyr Ala Val His Gly His Gln Leu
515 520 525

Glu Ala Leu Tyr Arg Ala Leu Asn Val Pro His Asp Ile Ala Ala Arg
530 535 540

Ala Ser Arg Leu Thr Ala Thr Val Glu Leu Val Ala Ser Pro Asp Arg
545 550 555 560

Leu Glu Cys Arg Thr Val Leu Gly Asn Lys Thr Phe Arg Thr Thr Val
565 570 575

Val Asp Gly Ala His Leu Glu Ala Asn Gly Pro Glu Glu Tyr Val Leu
580 585 590

Ser Phe Asp Ala Ser Arg Gln Ser Met Gly Ala Gly Ser His Ser Leu
595 600 605

Thr Tyr Glu Leu Thr Pro Ala Gly Leu Gln Val Lys Ile Ser Ser Asn
610 615 620

Gly Leu Asp Cys Thr Ala Thr Phe Pro Xaa Gly Gly Ala Pro Ser Ala
625 630 635 640

Ala Pro Gly Glu Val Xaa Ala Phe Cys Ser Ala Leu Tyr Arg Tyr Asn
645 650 655

Arg Phe Thr Gln Arg His Ser Leu Thr Gly Gly Leu Trp Leu His Pro
660 665 670

Glu Gly Leu Leu Gly Ile Phe Pro Pro Phe Ser Pro Gly His Ile Trp
675 680 685

Glu Ser Ala Asn Pro Phe Cys Gly Glu Gly Thr Leu Tyr Thr Arg Thr
690 695 700

Trp Ser Thr Ser Gly Phe Ser Ser Asp Phe Ser Pro Pro Glu Ala Ala
705 710 715 720

Ala Pro Ala Ser Ala Ala Ala Pro Gly Leu Pro Tyr Pro Thr Pro Pro
725 730 735

Val Ser Asp Ile Trp Val Leu Pro Pro Pro Ser Glu Glu Ser His Val
740 745 750

Asp Ala Ala Ser Val Pro Ser Val Pro Glu Pro Ala Gly Leu Thr Ser
755 760 765

Pro Ile Val Leu Thr Pro Pro Pro Pro Pro Pro Pro Val Arg Lys Pro
770 775 780

Ala Thr Ser Pro Pro Pro Arg Thr Arg Arg Leu Leu Tyr Thr Tyr Pro
785 790 795 800

Asp Gly Ala Lys Val Tyr Ala Gly Ser Leu Xaa Glu Ser Asp Cys Asp
805 810 815

Trp Leu Val Asn Ala Ser Asn Pro Gly His Arg Pro Gly Gly Gly Leu
820 825 830

Cys His Ala Phe Tyr Gln Arg Phe Pro Glu Ala Phe Tyr Ser Thr Glu
835 840 845

Phe Ile Met Arg Glu Gly Leu Ala Ala Tyr Thr Leu Thr Pro Arg Pro
850 855 860

Ile Ile His Ala Val Ala Pro Asp Tyr Arg Val Glu Gln Asn Pro Lys
865 870 875 880

Arg Leu Glu Ala Ala Tyr Arg Glu Thr Cys Ser Arg Arg Gly Thr Ala
885 890 895

Ala Tyr Pro Leu Leu Gly Ser Gly Ile Tyr Gln Val Pro Val Ser Leu
900 905 910

Ser Phe Asp Ala Trp Glu Arg Asn His Arg Pro Gly Asp Glu Leu Tyr
915 920 925

Leu Thr Glu Pro Ala Ala Ala Trp Phe Glu Ala Asn Lys Pro Ala Gln
930 935 940

Pro Ala Leu Thr Ile Thr Glu Asp Thr Ala Arg Thr Ala Asn Leu Ala
945 950 955 960

Leu Glu Ile Asp Ala Ala Thr Glu Val Gly Arg Ala Cys Ala Gly Cys
965 970 975

Thr Ile Ser Pro Gly Ile Val His Tyr Gln Phe Thr Ala Gly Val Pro
980 985 990

Gly Ser Gly Lys Ser Arg Ser Ile Gln Gln Gly Asp Val Asp Val Val
995 1000 1005

Val Val Pro Thr Arg Glu Leu Arg Asn Ser Trp Arg Arg Arg Gly Phe
1010 1015 1020

Ala Ala Phe Thr Pro His Thr Ala Ala Arg Val Thr Ile Gly Arg Arg
1025 1030 1035 1040

Val Val Ile Asp Glu Ala Pro Ser Leu Pro Pro His Leu Leu Leu Leu
1045 1050 1055

His Met Gln Arg Ala Ser Ser Val His Leu Leu Gly Asp Pro Asn Gln
1060 1065 1070

Ile Pro Ala Ile Asp Phe Glu His Ala Gly Leu Val Pro Ala Ile Arg
1075 1080 1085

Pro Glu Leu Ala Pro Thr Ser Trp Trp His Val Thr His Arg Cys Pro
1090 1095 1100

Ala Asp Val Cys Glu Leu Ile Arg Gly Ala Tyr Pro Lys Ile Gln Thr
1105 1110 1115 1120

Thr Ser Arg Val Leu Arg Ser Leu Phe Trp Asn Glu Pro Ala Ile Gly
1125 1130 1135

Gln Lys Leu Val Phe Thr Gln Ala Ala Lys Ala Ala Asn Pro Gly Ala
1140 1145 1150

Ile Thr Val His Glu Ala Gln Gly Ala Thr Phe Thr Glu Thr Thr Ile
1155 1160 1165

Ile Ala Thr Ala Asp Ala Arg Gly Leu Ile Gln Ser Ser Arg Ala His
1170 1175 1180

Ala Ile Val Ala Leu Thr Arg His Thr Glu Lys Cys Val Ile Leu Asp
1185 1190 1195 1200

Ala Pro Gly Leu Leu Arg Glu Val Gly Ile Ser Asp Val Ile Val Asn
1205 1210 1215

Asn Phe Phe Leu Ala Gly Gly Glu Val Gly His His Arg Pro Ser Val
1220 1225 1230

Ile Pro Arg Gly Asn Pro Asp Gln Asn Leu Gly Thr Leu Gln Ala Phe
1235 1240 1245

Pro Pro Ser Cys Gln Ile Ser Ala Tyr His Gln Leu Ala Glu Glu Leu
1250 1255 1260

Gly His Arg Pro Ala Pro Val Ala Ala Val Leu Pro Pro Cys Pro Glu
1265 1270 1275 1280

Leu Glu Gln Gly Leu Leu Tyr Met Pro Gln Glu Leu Thr Val Ser Asp
1285 1290 1295

Ser Val Leu Val Phe Glu Leu Thr Asp Ile Val His Cys Arg Met Ala
1300 1305 1310

Ala Pro Ser Gln Arg Lys Ala Val Leu Ser Thr Leu Val Gly Arg Tyr
1315 1320 1325

Gly Arg Arg Thr Lys Leu Tyr Glu Ala Ala His Ser Asp Val Arg Glu
1330 1335 1340

Ser Leu Ala Arg Phe Ile Pro Thr Ile Gly Pro Val Arg Ala Thr Thr
1345 1350 1355 1360

Cys Glu Leu Tyr Glu Leu Val Glu Ala Met Val Glu Lys Gly Gln Asp
1365 1370 1375

Gly Ser Ala Val Leu Glu Leu Asp Leu Cys Asn Arg Asp Val Ser Arg
1380 1385 1390

Ile Thr Phe Phe Gln Lys Asp Cys Asn Lys Phe Thr Thr Gly Glu Thr
1395 1400 1405

Ile Ala His Gly Lys Val Gly Gln Gly Ile Ser Ala Trp Ser Lys Thr
1410 1415 1420

Phe Cys Ala Leu Phe Gly Pro Trp Phe Arg Ala Ile Glu Lys Glu Ile
1425 1430 1435 1440

Leu Ala Leu Leu Pro Pro Asn Ile Phe Tyr Gly Asp Ala Tyr Glu Glu
1445 1450 1455

Ser Val Phe Ala Ala Ala Val Ser Gly Ala Gly Ser Cys Met Val Phe
1460 1465 1470

Glu Asn Asp Phe Ser Glu Phe Asp Ser Thr Gln Asn Asn Phe Ser Leu
1475 1480 1485

Gly Leu Glu Cys Val Val Met Glu Glu Cys Gly Met Pro Gln Trp Leu
1490 1495 1500

Ile Arg Leu Tyr His Leu Val Arg Ser Ala Trp Ile Leu Gln Ala Pro
1505 1510 1515 1520

Lys Glu Ser Leu Lys Gly Phe Trp Lys Lys His Ser Gly Glu Pro Gly
1525 1530 1535

Thr Leu Leu Trp Asn Thr Val Trp Asn Met Ala Ile Ile Ala His Cys
1540 1545 1550

Xaa Glu Phe Arg Asp Phe Arg Val Ala Ala Phe Lys Gly Asp Asp Ser
1555 1560 1565

Val Val Leu Cys Ser Asp Tyr Arg Gln Xaa Arg Asn Ala Ala Ala Leu
1570 1575 1580

Ile Ala Gly Cys Gly Leu Lys Leu Lys Val Asp Tyr Arg Pro Ile Gly
1585 1590 1595 1600

Leu Tyr Ala Gly Val Val Val Ala Pro Gly Leu Gly Thr Leu Pro Asp
1605 1610 1615

Val Val Arg Phe Ala Gly Arg Leu Ser Glu Lys Asn Trp Gly Pro Gly
1620 1625 1630

Pro Glu Arg Ala Glu Gln Leu Arg Leu Ala Val Cys Asp Phe Leu Arg
1635 1640 1645

Gly Leu Thr Asn Val Ala Gln Val Cys Val Asp Val Val Ser Arg Val
1650 1655 1660

Tyr Gly Val Ser Pro Gly Leu Val His Asn Leu Ile Gly Met Leu Gln
1665 1670 1675 1680

Thr Ile Ala Asp Gly Lys Ala His Phe Thr Xaa Asn Ile Lys Pro Val
1685 1690 1695

Leu Asp Leu Thr Asn Ser Ile Ile Gln Arg Val Glu
1700 1705



<210> 167 
<211> 660 
<212> PRT 

<213> Hepatitis E virus 
<400> 167 

Met Arg Pro Arg Ala Val Leu Leu Leu Leu Phe Val Leu Leu Pro Met
1 5 10 15

Leu Pro Ala Pro Pro Ala Gly Gln Pro Ser Gly Arg Arg Arg Gly Arg
20 25 30

Arg Ser Gly Gly Ala Gly Gly Gly Phe Trp Gly Asp Arg Val Asp Ser
35 40 45

Gln Pro Phe Ala Leu Pro Tyr Ile His Pro Thr Asn Pro Phe Ala Ala
50 55 60

Asp Val Val Ser Gln Pro Gly Ala Gly Thr Arg Pro Arg Gln Pro Pro
65 70 75 80

Arg Pro Leu Xaa Ser Ala Trp Arg Asp Gln Ser Gln Arg Pro Ser Ala
85 90 95

Ala Pro Arg Arg Arg Ser Ala Pro Ala Gly Ala Ala Pro Leu Thr Ala
100 105 110

Val Ser Pro Ala Pro Asp Thr Ala Pro Val Pro Asp Val Asp Ser Arg
115 120 125

Gly Ala Ile Leu Arg Arg Gln Tyr Asn Leu Ser Thr Ser Pro Leu Thr
130 135 140

Ser Ser Val Ala Ser Gly Thr Asn Leu Val Leu Tyr Ala Ala Pro Leu
145 150 155 160

Asn Pro Leu Leu Pro Leu Gln Asp Gly Thr Asn Thr His Ile Met Ala
165 170 175

Thr Glu Ala Ser Asn Tyr Ala Gln Tyr Arg Val Val Arg Ala Thr Ile
180 185 190

Arg Tyr Arg Pro Leu Val Pro Asn Ala Val Gly Gly Tyr Ala Ile Ser
195 200 205

Ile Ser Phe Trp Pro Gln Thr Thr Thr Thr Pro Thr Ser Val Asp Met
210 215 220

Asn Ser Ile Thr Ser Xaa Asp Val Arg Ile Leu Val Gln Pro Gly Ile
225 230 235 240

Ala Ser Glu Leu Val Ile Pro Ser Glu Arg Leu His Tyr Arg Asn Gln
245 250 255

Gly Trp Arg Ser Val Glu Thr Thr Gly Val Ala Glu Glu Glu Ala Thr
260 265 270

Ser Gly Leu Val Met Leu Cys Ile His Gly Ser Pro Val Asn Ser Tyr
275 280 285

Thr Asn Thr Pro Tyr Thr Gly Ala Leu Gly Leu Leu Asp Phe Ala Leu
290 295 300

Glu Leu Glu Phe Arg Asn Leu Thr Pro Gly Asn Thr Asn Thr Arg Val
305 310 315 320

Ser Arg Tyr Thr Ser Thr Ala Arg His Arg Leu Arg Arg Gly Ala Asp
325 330 335

Gly Thr Ala Glu Leu Thr Thr Thr Ala Ala Thr Arg Phe Met Lys Asp
340 345 350

Leu His Phe Ala Gly Thr Asn Gly Val Gly Glu Val Gly Arg Gly Ile
355 360 365

Ala Leu Thr Leu Phe Asn Leu Ala Asp Thr Leu Leu Gly Gly Leu Pro
370 375 380

Thr Glu Leu Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro
385 390 395 400

Val Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val
405 410 415

Glu Asn Ala Gln Gln Asp Lys Gly Ile Thr Ile Pro His Asp Ile Asp
420 425 430

Leu Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Asp Asn Gln Xaa Glu
435 440 445

Gln Asp Arg Pro Thr Pro Ser Pro Ala Pro Ser Arg Pro Phe Ser Val
450 455 460

Leu Arg Ala Asn Asp Val Leu Trp Leu Ser Leu Thr Ala Ala Glu Tyr
465 470 475 480

Asp Gln Thr Thr Tyr Gly Ser Ser Thr Asn Pro Met Tyr Val Ser Asp
485 490 495

Thr Val Thr Leu Val Asn Val Ala Thr Gly Ala Gln Ala Val Ala Arg
500 505 510

Ser Leu Asp Trp Ser Lys Val Thr Leu Asp Gly Arg Pro Leu Thr Thr
515 520 525

Ile Gln Gln Tyr Ser Lys Thr Phe Tyr Val Leu Pro Leu Arg Gly Lys
530 535 540

Leu Ser Phe Trp Glu Ala Gly Thr Thr Lys Ala Gly Tyr Pro Tyr Asn
545 550 555 560

Tyr Asn Thr Thr Ala Ser Asp Gln Ile Leu Ile Glu Asn Ala Ala Gly
565 570 575

His Arg Val Ala Ile Ser Thr Tyr Thr Thr Ser Leu Gly Ala Gly Pro
580 585 590

Thr Ser Ile Ser Ala Val Gly Val Leu Ala Pro His Ser Ala Leu Ala
595 600 605

Val Leu Glu Asp Thr Ile Asp Tyr Pro Ala Arg Ala His Thr Phe Asp
610 615 620

Asp Phe Cys Pro Glu Cys Arg Thr Leu Gly Leu Gln Gly Cys Ala Phe
625 630 635 640

Gln Ser Thr Ile Ala Glu Leu Gln Arg Leu Lys Met Lys Val Gly Lys
645 650 655

Thr Arg Glu Ser
660



<210> 168
<211> 122
<212> PRT

<213> Hepatitis E virus
<220>

<223> us2 orf3
<400> 168

Met Asn Asn Met Ser Phe Ala Ser Pro Met Gly Ser Pro Cys Ala Leu
1 5 10 15

Gly Leu Phe Cys Cys Cys Ser Ser Cys Phe Cys Leu Cys Cys Pro Arg
20 25 30

His Arg Pro Ala Ser Arg Leu Ala Ala Val Val Gly Gly Ala Ala Ala
35 40 45

Val Pro Ala Val Val Ser Gly Val Thr Gly Leu Ile Leu Ser Pro Ser
50 55 60

Pro Ser Pro Ile Phe Ile Gln Pro Thr Pro Ser Pro Pro Met Ser Phe
65 70 75 80

His Asn Pro Gly Leu Glu Leu Ala Leu Asp Ser Arg Pro Ala Pro Leu
85 90 95

Xaa Pro Leu Gly Val Thr Ser Pro Ser Ala Pro Pro Leu Pro Pro Val
100 105 110

Val Asp Leu Pro Gln Leu Gly Leu Arg Arg
115 120



<210> 169 
<211> 33 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> M 4-2 
<400> 169 

Ala Asn Gln Pro Gly His Leu Ala Pro Leu Gly Glu Ile Arg Pro Ser
1 5 10 15

Ala Pro Pro Leu Pro Pro Val Ala Asp Leu Pro Gln Pro Gly Leu Arg
20 25 30 

Arg 



<210> 170 
<211> 48 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> M 3-2e 
<400> 170 

Thr Phe Asp Tyr Pro Gly Arg Ala His Thr Phe Asp Asp Phe Cys Pro
1 5 10 15

Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Val
20 25 30

Ala Glu Leu Gln Arg Leu Lys Val Lys Val Gly Lys Thr Arg Glu Leu
35 40 45 



<210> 171 
<211> 33 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> B 4-2 
<400> 171 

Ala Asn Pro Pro Asp His Ser Ala Pro Leu Gly Val Thr Arg Pro Ser
1 5 10 15

Ala Pro Pro Leu Pro His Val Val Asp Leu Pro Gln Leu Gly Pro Arg
20 25 30 

Arg 



<210> 172 
<211> 48 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> B 3-2e 
<400> 172 

Thr Leu Asp Tyr Pro Ala Arg Ala His Thr Phe Asp Asp Phe Cys Pro
1 5 10 15

Glu Cys Arg Pro Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Val
20 25 30

Ala Glu Leu Gln Arg Leu Lys Met Lys Val Gly Lys Thr Arg Glu Leu
35 40 45 



<210> 173 
<211> 33 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> ORF3 (u4.2) 
<400> 173 

Asp Ser Arg Pro Ala Pro Ser Val Pro Leu Gly Val Thr Ser Pro Ser
1 5 10 15

Ala Pro Pro Leu Pro Pro Val Val Asp Leu Pro Gln Leu Gly Leu Arg
20 25 30 

Arg 



<210> 174 
<211> 48 
<212> PRT 

<213> Hepatitis E virus 



<220> 

<223> ORF2 (u3.2e)
<400> 174

Thr Val Asp Tyr Pro Ala Arg Ala His Thr Phe Asp Asp Phe Cys Pro
1 5 10 15

Glu Cys Arg Thr Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Ile
20 25 30

Ala Glu Leu Gln Arg Leu Lys Met Lys Val Gly Lys Thr Arg Glu Ser
35 40 45



<210> 175 
<211> 33 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> US 4-2 
<400> 175 

Asp Ser Arg Pro Ala Pro Ser Val Pro Leu Gly Val Thr Ser Pro Ser
1 5 10 15

Ala Pro Pro Leu Pro Pro Val Val Asp Leu Pro Gln Leu Gly Leu Arg
20 25 30 

Cys 



<210> 176 
<211> 48 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> US 3-2e 
<400> 176 

Thr Val Asp Tyr Pro Ala Arg Ala His Thr Phe Asp Asp Phe Cys Pro
1 5 10 15

Glu Cys Arg Thr Leu Gly Val Gln Gly Cys Ala Phe Gln Ser Thr Ile
20 25 30

Ala Glu Val Gln Arg Leu Lys Met Lys Val Gly Lys Thr Arg Glu Val
35 40 45 



<210> 177 
<211> 21
<212> DNA

<213> Hepatitis E virus
<220>

<223> HEVConsORF1-s2
<400> 177 

ctgccytkgc gaatgctgtg g 21 



<210> 178 
<211> 24 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEVConsORF1-a2
<400> 178 

ggcagwrtac carcgctgaa catc 24 



<210> 179 
<211> 294 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> zl2-orf1 (G.S.)
<400> 179

tggcattact actgccattg agcaagctgc tctggctgcg gccaattctg ccttggcgaa 60
tgctgtggtg gttcggccgt ttttatctcg tttacagact gagattctta ttaatttgat 120 
gcaaccccga cagttggtct ttcgacctga ggtgttctgg aaccatccca tccaacgtgt 180 
tatacataat gaattggagc agtactgccg ggcccgggcc ggtcgctgtc tggaaattgg 240 
agcccatcca aggtcaatca atgataatcc taatgttctg catcggtgtt tcct 294 



<210> 180 
<211> 418 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> zl2-orf1.con
<400> 180

ctggcattac tactgctatt gagcaagctg ctctgggtgc ggccaattct gccttggcga 60
atgctgtggt ggttcggccg tttttatctc gtttacagac tgagattctt attaatttga 120
tgcaaccccg acagttggtc tttcgacctg aggtgttctg gaaccatccc atccaacgtg 180
ttatacataa tgaattggag cagtactgcc gggcccgggc cggtcgctgt ctggaaattg 240
gagcccatcc aaggtcaatc aatgataatc ctaatgttct gcatcggtgc tttttacgac 300
cggtcgggag ggacgttcag cgctggtact ccgcccccac ccgtggcccc gcggccaact 360
gccgccggtc tgcgctgcgt ggtctccccc ctgtcgaccg cacttactgc ctcgatgg 418



<210> 181 
<211> 197 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> zl2-orf2.con
<400> 181

gacagaatta atttcgtcgg ctgggggtca actgttctac tcccgccctg tcgtctcagc 60
caatggcgag ccgactgtca agttatacac atctgttgag aatgcacagc aggataaggg 120
gatagctatt ccacatgaca tagatttggg cgactctcgt ttggtaatcc aggattatga 180
taaccaacac gaacaag 197



<210> 182 
<211> 25 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEVConsORF2/3-s1
<400> 182 

gtatcggkyk gaatgaataa catgt 25 



<210> 183 
<211> 25 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEVConsORF2/3-a1
<400> 183 

aggggttggt tggatgaata taggg 25 



<210> 184 
<211> 234 
<212> DNA 

<213> Hepatitis E virus 



<220>

<223> zl2-orf23.con
<400> 184

gtatcggktt gaatgaataa catgttttgt gcatcgccca tgggatcacc atgcgcccta 60
gggttgttct gttgttgttc ctcgtgtttc tgcctatgct gcccgcgcca ccggccggcc 120
agycgactgg ccgccgtcgt gggcggcgca gcggcggtgc cggcggtggt ttctggggtg 180
acagggttga ttctcagccc ttcgccctcc cctatattca tccaaccaac ccct 234

<210> 185 
<211> 890 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> zl2-3p.race 
<400> 185 

gtcgtctcgg ccaatggcga gccgactgtc aagttataca catctgttga gaatgcacag 60
caggataagg ggatagctat tccacatgac atagatttgg gcgactctcg tttggtaatc 120
caggattacg ataatcagca cgagcaggac cggcccaccc cttcgcccgc cccgtctcgt 180
cctttctcgg tcctccgcgc taatgatgct ttgtggcttt ctcttaccgc tgctgagtat 240
gaccagacta catatgggtc gtccaccaac ccgatgtatg tctcagacac tgttacattt 300
gtcaatgtgg ccacaggggc tcaggctgtc gcccgttctc ttgattggtc taaagttacc 360
ctggacggcc gccctcttac taccatccag cagtactcta agacatttta tgttctccca 420
cttcgcggga agttatcttt ttgggaggct ggcacaacta aagccggtta cccttataat 480
tataacacaa ctgctagtga ccagattctg attgaaaacg cggctggcca tcgtgtcgct 540
atatctactt atactactag cctgggcgcc ggccctgtgt cagtttctgc ggttggtgtg 600
ttagccccac actcgagcct tgctattctt gaagacactg ttgactatcc ggcccgtgct 660
cacacttttg atgacttctg tccggaatgc cgtgccctgg gtctgcaggg gtgtgctttt 720
caatctacta tcgctgagct ccagcgtctt aaaatgaagg taggcaaaac ccgggagttt 780
taattaattc ttcttgtgcc cccttcacgg ttctcgcttt atttctttct tctgcctccc 840
gcgctccctg gaaaaaaaaa aaaaaaaaaa gtactagtcg acgcgtggcc 890

<210> 186
<211> 919
<212> DNA

<213> Hepatitis E virus
<220>

<223> zl2-3p.con
<400> 186





gacagaatta atttcgtcgg ctgggggtca actgttctac tcccgccctg tcgtctcagc 60
caatggcgag ccgactgtca agttatacac atctgttgag aatgcacagc aggataaggg 120
gatagctatt ccacatgaca tagatttggg cgactctcgt ttggtaatcc aggattacga 180
taatcagcac gagcaggacc ggcccacccc ttcgcccgcc ccgtctcgtc ctttctcggt 240
cctccgcgct aatgatgctt tgtggctttc tcttaccgct gctgagtatg accagactac 300
atatgggtcg tccaccaacc cgatgtatgt ctcagacact gttacatttg tcaatgtggc 360
cacaggggct caggctgtcg cccgttctct tgattggtct aaagttaccc tggacggccg 420
ccctcttact accatccagc agtactctaa gacattttat gttctcccac ttcgcgggaa 480
gttatctttt tgggaggctg gcacaactaa agccggttac ccttataatt ataacacaac 540
tgctagtgac cagattctga ttgaaaacgc ggctggccat cgtgtcgcta tatctactta 600
tactactagc ctgggcgccg gccctgtgtc agtttctgcg gttggtgtgt tagccccaca 660
ctcgagcctt gctattcttg aagacactgt tgactatccg gcccgtgctc acacttttga 720
tgacttctgt ccggaatgcc gtgccctggg tctgcagggg tgtgcttttc aatctactat 780
cgctgagctc cagcgtctta aaatgaaggt aggcaaaacc cgggagtttt aattaattct 840
tcttgtgccc ccttcacggt tctcgcttta tttctttctt ctgcctcccg cgctccctgg 900
aaaaaaaaaa aaaaaaaaa 919



<210> 187 
<211> 138 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> zl2-orf1.pep
<400> 187

Gly Ile Thr Thr Ala Ile Glu Gln Ala Ala Leu Gly Ala Ala Asn Ser
1 5 10 15

Ala Leu Ala Asn Ala Val Val Val Arg Pro Phe Leu Ser Arg Leu Gln
20 25 30

Thr Glu Ile Leu Ile Asn Leu Met Gln Pro Arg Gln Leu Val Phe Arg
35 40 45

Pro Glu Val Phe Trp Asn His Pro Ile Gln Arg Val Ile His Asn Glu
50 55 60

Leu Glu Gln Tyr Cys Arg Ala Arg Ala Gly Arg Cys Leu Glu Ile Gly
65 70 75 80

Ala His Pro Arg Ser Ile Asn Asp Asn Pro Asn Val Leu His Arg Cys
85 90 95

Phe Leu Arg Pro Val Gly Arg Asp Val Gln Arg Trp Tyr Ser Ala Pro
100 105 110

Thr Arg Gly Pro Ala Ala Asn Cys Arg Arg Ser Ala Leu Arg Gly Leu
115 120 125

Pro Pro Val Asp Arg Thr Tyr Cys Leu Asp
130 135



<210> 188 
<211> 61 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> zl2-orf2-5'.pep
<400> 188

Met Arg Pro Arg Val Val Leu Leu Leu Phe Leu Val Phe Leu Pro Met
1 5 10 15

Leu Pro Ala Pro Pro Ala Gly Gln Xaa Thr Gly Arg Arg Arg Gly Arg
20 25 30

Arg Ser Gly Gly Ala Gly Gly Gly Phe Trp Gly Asp Arg Val Asp Ser
35 40 45

Gln Pro Phe Ala Leu Pro Tyr Ile His Pro Thr Asn Pro
50 55 60



<210> 189 
<211> 276 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> zl2-orf2-3'.pep
<400> 189

Thr Glu Leu Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro
1 5 10 15

Val Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val
20 25 30

Glu Asn Ala Gln Gln Asp Lys Gly Ile Ala Ile Pro His Asp Ile Asp
35 40 45

Leu Gly Asp Ser Arg Leu Val Ile Gln Asp Tyr Asp Asn Gln His Glu
50 55 60

Gln Asp Arg Pro Thr Pro Ser Pro Ala Pro Ser Arg Pro Phe Ser Val
65 70 75 80

Leu Arg Ala Asn Asp Ala Leu Trp Leu Ser Leu Thr Ala Ala Glu Tyr
85 90 95

Asp Gln Thr Thr Tyr Gly Ser Ser Thr Asn Pro Met Tyr Val Ser Asp
100 105 110

Thr Val Thr Phe Val Asn Val Ala Thr Gly Ala Gln Ala Val Ala Arg
115 120 125

Ser Leu Asp Trp Ser Lys Val Thr Leu Asp Gly Arg Pro Leu Thr Thr
130 135 140

Ile Gln Gln Tyr Ser Lys Thr Phe Tyr Val Leu Pro Leu Arg Gly Lys
145 150 155 160

Leu Ser Phe Trp Glu Ala Gly Thr Thr Lys Ala Gly Tyr Pro Tyr Asn
165 170 175

Tyr Asn Thr Thr Ala Ser Asp Gln Ile Leu Ile Glu Asn Ala Ala Gly
180 185 190

His Arg Val Ala Ile Ser Thr Tyr Thr Thr Ser Leu Gly Ala Gly Pro
195 200 205

Val Ser Val Ser Ala Val Gly Val Leu Ala Pro His Ser Ser Leu Ala
210 215 220

Ile Leu Glu Asp Thr Val Asp Tyr Pro Ala Arg Ala His Thr Phe Asp
225 230 235 240

Asp Phe Cys Pro Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe
245 250 255

Gln Ser Thr Ile Ala Glu Leu Gln Arg Leu Lys Met Lys Val Gly Lys
260 265 270

Thr Arg Glu Phe
275



<210> 190 
<211> 74 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> zl2-orf3.pep 
<400> 190 

Met Asn Asn Met Phe Cys Ala Ser Pro Met Gly Ser Pro Cys Ala Leu
1 5 10 15

Gly Leu Phe Cys Cys Cys Ser Ser Cys Phe Cys Leu Cys Cys Pro Arg
20 25 30

His Arg Pro Ala Ser Arg Leu Ala Ala Val Val Gly Gly Ala Ala Ala
35 40 45

Val Pro Ala Val Val Ser Gly Val Thr Gly Leu Ile Leu Ser Pro Ser
50 55 60

Pro Ser Pro Ile Phe Ile Gln Pro Thr Pro
65 70



<210> 191 
<211> 408 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> pJOorf3-29.seq
<400> 191

gaattcatga ataacatgtc ttttgcatcg cccatgggat caccatgcgc cctagggctg 60
ttctgttgtt gctcttcgtg cttttgccta tgctgcccgc gccaccggcc agccagccgt 120
ctggccgccg tcgtgggcgg cgcagcggcg gtgccggcgg tggtttctgg ggtgacaggg 180
ttgattctca gcccttcgcc ctcccctata ttcatccaac caaccccttc gccgccgatg 240
tcgtttcaca acccggggct ggaactcgcc ctcgacagcc gccccgcccc cttggctccg 300
cttggcgtga ccagtcccag cgcccctccg ctgccccccg tcgtcgatct gccccagctt 360
ggtctgcgcc gcgactacaa ggacgacgat gacaagtaat aaggatcc 408



<210> 192 
<211> 1026 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> cksorf2m-2.seq
<400> 192 

gaattcatgg gtgctgatgg gactgctgag cttactacca cagcagccac acgtttcatg 60
aaggacctgc acttcgctgg cacgaatggc gttggtgagg tgggtcgtgg tatcgccctg 120
acactgttca atctcgctga tacgcttctc ggcggtttac cgacagaatt gatttcgtcg 180
gctgggggcc aactgtttta ctcccgcccg gttgtctcag ccaatggcga gccaacagta 240
aagttatata catctgttga gaatgcgcag caagacaagg gcatcaccat tccacatgat 300
atagacctgg gtgactcccg tgtggttatc caggattatg ataaccagca tgagcaagac 360
cgacctactc cgtcacctgc cccctctcgc cccttctcag ttcttcgtgc caatgatgtt 420
ttgtggcttt ccctcactgc cgctgagtat gaccagacta cgtatgggtc gtccaccaac 480
cctatgtatg tctctgacac agttacgctt gttaatgtgg ctactggtgc tcaggctgtt 540
gcccgctccc ttgattggtc taaagttact ctggacggcc gcccccttac taccattcag 600
cagtattcta agacatttta tgttctcccg ctccgcggga agctgtcctt ttgggaggct 660
ggcacgacta aggccggcta cccttacaat tataatacta ccgctagtga ccaaattttg 720
attgagaatg cggccggcca ccgtgtcgct atttccacct ataccactag cttaggtgcc 780
ggtcctacct cgatctctgc ggtcggcgta ctggctccac actctgccct tgccgttctt 840
gaggatacta ttgattaccc cgcccgtgcc catacttttg atgatttttg cccggagtgc 900
cgtaccctag gtttgcaggg ttgtgcattc cagtctacta ttgctgagct ccagcgttta 960
aaaatgaagg taggtaaaac ccgggagtct gactacaagg acgacgatga caagtaataa 1020
ggatcc 1026

<210> 193

<211> 1389 

<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> CKSORF32M-3.seq
<400> 193

gaattcatga ataacatgtc ttttgcatcg cccatgggat caccatgcgc cctagggctg 60
ttctgttgtt gctcttcgtg cttttgccta tgctgcccgc gccaccggcc agccagccgt 120
ctggccgccg tcgtgggcgg cgtagcggcg gtgccggcgg tggtttctgg ggtgacaggg 180
ttgattctca gcccttcgcc ctcccctata ttcatccaac caaccccttc gccgccgatg 240
tcgtttcaca acccggggct ggaactcgcc ctcgacagcc gccccgcccc cttggctccg 300
cttggcgtga ccagtcccag cgcccctccg ctgccccccg tcgtcgatct gccccagctt 360
ggtctgcgcc gcggtgctga tgggactgct gagcttacta ccacagcagc cacacgtttc 420
atgaaggacc tgcacttcgc tggcacgaat ggcgttggtg aggtgggtcg tggtatcgcc 480
ctgacactgt tcaatctcgc tgatacgctt ctcggcggtt taccgacaga attgatttcg 540
tcggctgggg gccaactgtt ttactcccgc ccggttgtct cagccaatgg cgagccaaca 600
gtaaagttat atacatctgt tgagaatgcg cagcaagaca agggcatcac cattccacat 660
gatatagacc tgggtgactc ccgtgtggtt atccaggatt atgataacca gcatgagcaa 720
gaccgaccta ctccgtcacc tgccccctct cgccccttct cagttcttcg tgccaatgat 780
gttttgtggc tttccctcac tgccgctgag tatgaccaga ctacgtatgg gtcgtccacc 840
aaccctatgt atgtctctga cacagttacg cttgttaatg tggctactgg tgctcaggct 900
gttgcccgct cccttgattg gtctaaagtt actctggacg gccgccccct tactaccatt 960
cagcagtatt ctaagacatt ttatgttctc ccgctccgcg ggaagctgtc cttttgggag 1020
gctggcacga ctaaggccgg ctacccttac aattataata ctaccgctag tgaccaaatt 1080
ttgattgaga atgcggccgg ccaccgtgtc gctatttcca cctataccac tagcttaggt 1140
gccggtccta cctcgatctc tgcggtcggc gtactggctc cacactctgc ccttgccgtt 1200
cttgaggata ctattgatta ccccgcccgt gcccatactt ttgatgattt ttgcccggag 1260
tgccgtaccc taggtttgca gggttgtgca ttccagtcta ctattgctga gctccagcgt 1320
ttaaaaatga aggtaggtaa aacccgggag tctgactaca aggacgacga tgacaagtaa 1380
taaggatcc 1389

<210> 194
<211> 408 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> plorf3-12.con
<400> 194

gaattcatga ataacatgtc ttttgcatcg cccatgggat caccatgcgc cctagggctg 60
ttctgttgtt gctcttcgtg cttttgccta tgctgcccgc gccaccggcc ggccagccgt 120
ctggccgccg tcgtgggcgg cgcagcggcg gtgccggcgg tggtttctgg ggtgacaggg 180
ttgattctca gcccttcgcc ctcccctata ttcatccaac caaccccttc gccgccgatg 240
tcgtttcaca acccggggct ggaactcgcc ctcgacagcc gccccgcccc cttggctccg 300
cttggcgtga ccagtcccag cgcccctccg ctgccccccg tcgtcgatct gccccagctt 360
ggtctgcgcc gcgactacaa ggacgacgat gacaagtaat aaggatcc 408

<210> 195
<211> 1026
<212> DNA

<213> Hepatitis E virus
<220>

<223> plorf2.2-6.seq
<400> 195

gaattcatgg gtgctgatgg gactgctgag cttactacca cagcagccac acgtttcatg 60
aaggacctgc acttcgctgg cacgaatggc gttggtgagg tgggtcgtgg tatcgccctg 120
acactgttca atctcgctga tacgcttctc ggcggtttac cgacagaatt gatttcgtcg 180
gctgggggcc aactgtttta ctcccgcccg gttgtctcag ccaatggcga gccaacagta 240
aagttatata catctgttga gaatgcgcag caagacaagg gcatcaccat tccacatgat 300
atagacctgg gtgactcccg tgtggttatc caggattatg ataaccagca tgagcaagac 360
cgacctactc cgtcacctgc cccctctcgc cccttctcag ttcttcgtgc caatgatgtt 420
ttgtggcttt ccctcactgc cgctgagtat gaccagacta cgtatgggtc gtccaccaac 480
cctatgtatg tctctgacac agttacgctt gttaatgtgg ctactggtgc tcaggctgtt 540
gcccgctccc ttgattggtc taaagttact ctggacggcc gcccccttac taccattcag 600
cagtattcta agacatttta tgttctcccg ctccgcggga agctgtcctt ttgggaggct 660
ggcacgacta aggccggcta cccttacaat tataatacta ccgctagtga ccaaattttg 720
attgagaatg cggccggcca ccgtgtcgct atttccacct ataccactag cttaggtgcc 780
ggtcctacct cgatctctgc ggtcggcgta ctggctccac actctgccct tgccgttctt 840
gaggatacta ttgattaccc cgcccgtgcc catacttttg atgatttttg cccggagtgc 900
cgtaccctag gtttgcaggg ttgtgcattc cagtctacta ttgctgagct ccagcgttta 960
aaaatgaagg taggtaaaac ccgggagtct gactacaagg acgacgatga caagtaataa 1020
ggatcc 1026

<210> 196
<211> 1389
<212> DNA

<213> Hepatitis E virus
<220>

<223> PLORF32M-14-5.seq
<400> 196

gaattcatga ataacatgtc ttttgcatcg cccatgggat caccatgcgc cctagggctg 60
ttctgttgtt gctcttcgtg cttttgccta tgctgcccgc gccaccggcc agccagccgt 120
ctggccgccg tcgtgggcgg cgtagcggcg gtgccggcgg tggtttctgg ggtgacaggg 180
ttgattctca gcccttcgcc ctcccctata ttcatccaac caaccccttc gccgccgatg 240
tcgtttcaca acccggggct ggaactcgcc ctcgacagcc gccccgcccc cttggctccg 300
cttggcgtga ccagtcccag cgcccctccg ctgccccccg tcgtcgatct gccccagctt 360
ggtctgcgcc gcggtgctga tgggactgct gagcttacta ccacagcagc cacacgtttc 420
atgaaggacc tgcacttcgc tggcacgaat ggcgttggtg aggtgggtcg tggtatcgcc 480
ctgacactgt tcaatctcgc tgatacgctt ctcggcggtt taccgacaga attgatttcg 540
tcggctgggg gccaactgtt ttactcccgc ccggttgtct cagccaatgg cgagccaaca 600
gtaaagttat atacatctgt tgagaatgcg cagcaagaca agggcatcac cattccacat 660
gatatagacc tgggtgactc ccgtgtggtt atccaggatt atgataacca gcatgagcaa 720
gaccgaccta ctccgtcacc tgccccctct cgccccttct cagttcttcg tgccaatgat 780
gttttgtggc tttccctcac tgccgctgag tatgaccaga ctacgtatgg gtcgtccacc 840
aaccctatgt atgtctctga cacagttacg cttgttaatg tggctactgg tgctcaggct 900
gttgcccgct cccttgattg gtctaaagtt actctggacg gccgccccct tactaccatt 960


cagcagtatt 


ctaagacatt 


ttatgttctc 


ccgctccgcg 


ggaagctgtc 


cttttgggag 


1020 


gctggcacga 


ctaaggccgg 


ctacccttac 


aattataata 


ctaccgctag 


tgaccaaatt 


1080 


ttgattgaga 


atgcggccgg 


ccaccgtgtc 


gctatttcca 


cctataccac 


tagcttaggt 


1140 


gccggtccta 


cctcgatctc 


tgcggtcggc 


gtactggctc 


cacactctgc 


ccttgccgtt 


1200 


cttgaggata 


ctattgatta 


ccccgcccgt 


gcccatactt 


ttgatgattt 


ttgcccggag 


1260 


tgccgtaccc 


taggtttgca 


gggttgtgca 


ttccagtcta 


ctattgctga 


gctccagcgt 


1320 


ttaaaaatga 


aggtaggtaa 


aacccgggag 


tctgactaca 


aggacgacga 


tgacaagtaa 


1380 


taaggatcc 
<210> 197
<211> 74 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> zl2-orf 3-5 ' .pep 
<400> 197 

Met Asn Asn Met Phe Cys Ala Ser Pro Met Gly Ser Pro Cys Ala Leu
1 5 10 15

Gly Leu Phe Cys Cys Cys Ser Ser Cys Phe Cys Leu Cys Cys Pro Arg
20 25 30

His Arg Pro Ala Xaa Arg Leu Ala Ala Val Val Gly Gly Ala Ala Ala
35 40 45

Val Pro Ala Val Val Ser Gly Val Thr Gly Leu Ile Leu Ser Pro Ser
50 55 60

Pro Ser Pro Ile Phe Ile Gln Pro Thr Pro
65 70



<210> 198 
<211> 63 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Description of Artificial Sequence: Primer orf23p 
<400> 198 

tatatggatc cttattactt gtcatcgtcg tccttgtagt cagactcccg ggttttacct 60 
acc 63



<210> 199 
<211> 338 
<212> PRT 
<213> Hepatitis E virus

<220>

<223> cksorf2m-:.pep

<400> 199
Glu Phe Met Gly Ala Asp Gly Thr Ala Glu Leu Thr Thr Thr Ala Ala
1 5 10 15

Thr Arg Phe Met Lys Asp Leu His Phe Ala Gly Thr Asn Gly Val Gly
20 25 30

Glu Val Gly Arg Gly Ile Ala Leu Thr Leu Phe Asn Leu Ala Asp Thr
35 40 45

Leu Leu Gly Gly Leu Pro Thr Glu Leu Ile Ser Ser Ala Gly Gly Gln
50 55 60

Leu Phe Tyr Ser Arg Pro Val Val Ser Ala Asn Gly Glu Pro Thr Val
65 70 75 80

Lys Leu Tyr Thr Ser Val Glu Asn Ala Gln Gln Asp Lys Gly Ile Thr
85 90 95

Ile Pro His Asp Ile Asp Leu Gly Asp Ser Arg Val Val Ile Gln Asp
100 105 110

Tyr Asp Asn Gln His Glu Gln Asp Arg Pro Thr Pro Ser Pro Ala Pro
115 120 125

Ser Arg Pro Phe Ser Val Leu Arg Ala Asn Asp Val Leu Trp Leu Ser
130 135 140

Leu Thr Ala Ala Glu Tyr Asp Gln Thr Thr Tyr Gly Ser Ser Thr Asn
145 150 155 160

Pro Met Tyr Val Ser Asp Thr Val Thr Leu Val Asn Val Ala Thr Gly
165 170 175

Ala Gln Ala Val Ala Arg Ser Leu Asp Trp Ser Lys Val Thr Leu Asp
180 185 190

Gly Arg Pro Leu Thr Thr Ile Gln Gln Tyr Ser Lys Thr Phe Tyr Val
195 200 205

Leu Pro Leu Arg Gly Lys Leu Ser Phe Trp Glu Ala Gly Thr Thr Lys
210 215 220

Ala Gly Tyr Pro Tyr Asn Tyr Asn Thr Thr Ala Ser Asp Gln Ile Leu
225 230 235 240

Ile Glu Asn Ala Ala Gly His Arg Val Ala Ile Ser Thr Tyr Thr Thr
245 250 255

Ser Leu Gly Ala Gly Pro Thr Ser Ile Ser Ala Val Gly Val Leu Ala
260 265 270

Pro His Ser Ala Leu Ala Val Leu Glu Asp Thr Ile Asp Tyr Pro Ala
275 280 285

Arg Ala His Thr Phe Asp Asp Phe Cys Pro Glu Cys Arg Thr Leu Gly
290 295 300

Leu Gln Gly Cys Ala Phe Gln Ser Thr Ile Ala Glu Leu Gln Arg Leu
305 310 315 320

Lys Met Lys Val Gly Lys Thr Arg Glu Ser Asp Tyr Lys Asp Asp Asp
325 330 335

Asp Lys



<210> 200 
<211> 338 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> plorf 2.2-6.pep
<400> 200 

Glu Phe Met Gly Ala Asp Gly Thr Ala Glu Leu Thr Thr Thr Ala Ala
1 5 10 15

Thr Arg Phe Met Lys Asp Leu His Phe Ala Gly Thr Asn Gly Val Gly
20 25 30

Glu Val Gly Arg Gly Ile Ala Leu Thr Leu Phe Asn Leu Ala Asp Thr
35 40 45

Leu Leu Gly Gly Leu Pro Thr Glu Leu Ile Ser Ser Ala Gly Gly Gln
50 55 60

Leu Phe Tyr Ser Arg Pro Val Val Ser Ala Asn Gly Glu Pro Thr Val
65 70 75 80

Lys Leu Tyr Thr Ser Val Glu Asn Ala Gln Gln Asp Lys Gly Ile Thr
85 90 95

Ile Pro His Asp Ile Asp Leu Gly Asp Ser Arg Val Val Ile Gln Asp
100 105 110

Tyr Asp Asn Gln His Glu Gln Asp Arg Pro Thr Pro Ser Pro Ala Pro
115 120 125

Ser Arg Pro Phe Ser Val Leu Arg Ala Asn Asp Val Leu Trp Leu Ser
130 135 140

Leu Thr Ala Ala Glu Tyr Asp Gln Thr Thr Tyr Gly Ser Ser Thr Asn
145 150 155 160

Pro Met Tyr Val Ser Asp Thr Val Thr Leu Val Asn Val Ala Thr Gly
165 170 175

Ala Gln Ala Val Ala Arg Ser Leu Asp Trp Ser Lys Val Thr Leu Asp
180 185 190

Gly Arg Pro Leu Thr Thr Ile Gln Gln Tyr Ser Lys Thr Phe Tyr Val
195 200 205

Leu Pro Leu Arg Gly Lys Leu Ser Phe Trp Glu Ala Gly Thr Thr Lys
210 215 220

Ala Gly Tyr Pro Tyr Asn Tyr Asn Thr Thr Ala Ser Asp Gln Ile Leu
225 230 235 240

Ile Glu Asn Ala Ala Gly His Arg Val Ala Ile Ser Thr Tyr Thr Thr
245 250 255

Ser Leu Gly Ala Gly Pro Thr Ser Ile Ser Ala Val Gly Val Leu Ala
260 265 270

Pro His Ser Ala Leu Ala Val Leu Glu Asp Thr Ile Asp Tyr Pro Ala
275 280 285

Arg Ala His Thr Phe Asp Asp Phe Cys Pro Glu Cys Arg Thr Leu Gly
290 295 300

Leu Gln Gly Cys Ala Phe Gln Ser Thr Ile Ala Glu Leu Gln Arg Leu
305 310 315 320

Lys Met Lys Val Gly Lys Thr Arg Glu Ser Asp Tyr Lys Asp Asp Asp
325 330 335

Asp Lys



<210> 201 
<211> 37 
<212> DNA 

<213> Hepatitis E virus 



<220> 

<223> Description of Artificial Sequence: Primer orf35p 
<400> 201 

tatatgaatt catgaataac atgtcttttg catcgcc 37



<210> 202 
<211> 68 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Description of Artificial Sequence: Primer orf33p 
<400> 202 

tatatggatc cttattactt gtcatcgtcg tccttgtagt cgcggcgcag accaagctgg 60 
ggcagatc 68



<210> 203 
<211> 132 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> pJOorf 3-29.pep
<400> 203 

Glu Phe Met Asn Asn Met Ser Phe Ala Ser Pro Met Gly Ser Pro Cys
1 5 10 15

Ala Leu Gly Leu Phe Cys Cys Cys Ser Ser Cys Phe Cys Leu Cys Cys
20 25 30

Pro Arg His Arg Pro Ala Ser Arg Leu Ala Ala Val Val Gly Gly Ala
35 40 45

Ala Ala Val Pro Ala Val Val Ser Gly Val Thr Gly Leu Ile Leu Ser
50 55 60

Pro Ser Pro Ser Pro Ile Phe Ile Gln Pro Thr Pro Ser Pro Pro Met
65 70 75 80

Ser Phe His Asn Pro Gly Leu Glu Leu Ala Leu Asp Ser Arg Pro Ala
85 90 95

Pro Leu Ala Pro Leu Gly Val Thr Ser Pro Ser Ala Pro Pro Leu Pro
100 105 110

Pro Val Val Asp Leu Pro Gln Leu Gly Leu Arg Arg Asp Tyr Lys Asp
115 120 125

Asp Asp Asp Lys
130



<210> 204 
<211> 132 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> plorf 3-12.pep
<400> 204 

Glu Phe Met Asn Asn Met Ser Phe Ala Ser Pro Met Gly Ser Pro Cys
1 5 10 15

Ala Leu Gly Leu Phe Cys Cys Cys Ser Ser Cys Phe Cys Leu Cys Cys
20 25 30

Pro Arg His Arg Pro Ala Ser Arg Leu Ala Ala Val Val Gly Gly Ala
35 40 45

Ala Ala Val Pro Ala Val Val Ser Gly Val Thr Gly Leu Ile Leu Ser
50 55 60

Pro Ser Pro Ser Pro Ile Phe Ile Gln Pro Thr Pro Ser Pro Pro Met
65 70 75 80

Ser Phe His Asn Pro Gly Leu Glu Leu Ala Leu Asp Ser Arg Pro Ala
85 90 95

Pro Leu Ala Pro Leu Gly Val Thr Ser Pro Ser Ala Pro Pro Leu Pro
100 105 110

Pro Val Val Asp Leu Pro Gln Leu Gly Leu Arg Arg Asp Tyr Lys Asp
115 120 125

Asp Asp Asp Lys
130



<210> 205 

<211> 48 

<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Description of Artificial Sequence: Primer orf23 



<400> 205 
ctcagcagtc ccatcagcac cgcggcgcag accaagctgg ggcagatc



<210> 206 
<211> 459 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> CKSORF3 2M-3.pep
<400> 206 

Glu Phe Met Asn Asn Met Ser Phe Ala Ser Pro Met Gly Ser Pro Cys
1 5 10 15

Ala Leu Gly Leu Phe Cys Cys Cys Ser Ser Cys Phe Cys Leu Cys Cys
20 25 30

Pro Arg His Arg Pro Ala Ser Arg Leu Ala Ala Val Val Gly Gly Val
35 40 45

Ala Ala Val Pro Ala Val Val Ser Gly Val Thr Gly Leu Ile Leu Ser
50 55 60

Pro Ser Pro Ser Pro Ile Phe Ile Gln Pro Thr Pro Ser Pro Pro Met
65 70 75 80

Ser Phe His Asn Pro Gly Leu Glu Leu Ala Leu Asp Ser Arg Pro Ala
85 90 95

Pro Leu Ala Pro Leu Gly Val Thr Ser Pro Ser Ala Pro Pro Leu Pro
100 105 110

Pro Val Val Asp Leu Pro Gln Leu Gly Leu Arg Arg Gly Ala Asp Gly
115 120 125

Thr Ala Glu Leu Thr Thr Thr Ala Ala Thr Arg Phe Met Lys Asp Leu
130 135 140

His Phe Ala Gly Thr Asn Gly Val Gly Glu Val Gly Arg Gly Ile Ala
145 150 155 160

Leu Thr Leu Phe Asn Leu Ala Asp Thr Leu Leu Gly Gly Leu Pro Thr
165 170 175

Glu Leu Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro Val
180 185 190

Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val Glu
195 200 205

Asn Ala Gln Gln Asp Lys Gly Ile Thr Ile Pro His Asp Ile Asp Leu
210 215 220

Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Asp Asn Gln His Glu Gln
225 230 235 240

Asp Arg Pro Thr Pro Ser Pro Ala Pro Ser Arg Pro Phe Ser Val Leu
245 250 255

Arg Ala Asn Asp Val Leu Trp Leu Ser Leu Thr Ala Ala Glu Tyr Asp
260 265 270

Gln Thr Thr Tyr Gly Ser Ser Thr Asn Pro Met Tyr Val Ser Asp Thr
275 280 285

Val Thr Leu Val Asn Val Ala Thr Gly Ala Gln Ala Val Ala Arg Ser
290 295 300

Leu Asp Trp Ser Lys Val Thr Leu Asp Gly Arg Pro Leu Thr Thr Ile
305 310 315 320

Gln Gln Tyr Ser Lys Thr Phe Tyr Val Leu Pro Leu Arg Gly Lys Leu
325 330 335

Ser Phe Trp Glu Ala Gly Thr Thr Lys Ala Gly Tyr Pro Tyr Asn Tyr
340 345 350

Asn Thr Thr Ala Ser Asp Gln Ile Leu Ile Glu Asn Ala Ala Gly His
355 360 365

Arg Val Ala Ile Ser Thr Tyr Thr Thr Ser Leu Gly Ala Gly Pro Thr
370 375 380

Ser Ile Ser Ala Val Gly Val Leu Ala Pro His Ser Ala Leu Ala Val
385 390 395 400

Leu Glu Asp Thr Ile Asp Tyr Pro Ala Arg Ala His Thr Phe Asp Asp
405 410 415

Phe Cys Pro Glu Cys Arg Thr Leu Gly Leu Gln Gly Cys Ala Phe Gln
420 425 430

Ser Thr Ile Ala Glu Leu Gln Arg Leu Lys Met Lys Val Gly Lys Thr
435 440 445

Arg Glu Ser Asp Tyr Lys Asp Asp Asp Asp Lys
450 455



<210> 207 
<211> 459 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> PLORF32M-14-5.pep
<400> 207 

Glu Phe Met Asn Asn Met Ser Phe Ala Ser Pro Met Gly Ser Pro Cys
1 5 10 15

Ala Leu Gly Leu Phe Cys Cys Cys Ser Ser Cys Phe Cys Leu Cys Cys
20 25 30

Pro Arg His Arg Pro Ala Ser Arg Leu Ala Ala Val Val Gly Gly Val
35 40 45

Ala Ala Val Pro Ala Val Val Ser Gly Val Thr Gly Leu Ile Leu Ser
50 55 60

Pro Ser Pro Ser Pro Ile Phe Ile Gln Pro Thr Pro Ser Pro Pro Met
65 70 75 80

Ser Phe His Asn Pro Gly Leu Glu Leu Ala Leu Asp Ser Arg Pro Ala
85 90 95

Pro Leu Ala Pro Leu Gly Val Thr Ser Pro Ser Ala Pro Pro Leu Pro
100 105 110

Pro Val Val Asp Leu Pro Gln Leu Gly Leu Arg Arg Gly Ala Asp Gly
115 120 125

Thr Ala Glu Leu Thr Thr Thr Ala Ala Thr Arg Phe Met Lys Asp Leu
130 135 140

His Phe Ala Gly Thr Asn Gly Val Gly Glu Val Gly Arg Gly Ile Ala
145 150 155 160

Leu Thr Leu Phe Asn Leu Ala Asp Thr Leu Leu Gly Gly Leu Pro Thr
165 170 175

Glu Leu Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro Val
180 185 190

Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val Glu
195 200 205

Asn Ala Gln Gln Asp Lys Gly Ile Thr Ile Pro His Asp Ile Asp Leu
210 215 220

Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Asp Asn Gln His Glu Gln
225 230 235 240

Asp Arg Pro Thr Pro Ser Pro Ala Pro Ser Arg Pro Phe Ser Val Leu
245 250 255

Arg Ala Asn Asp Val Leu Trp Leu Ser Leu Thr Ala Ala Glu Tyr Asp
260 265 270

Gln Thr Thr Tyr Gly Ser Ser Thr Asn Pro Met Tyr Val Ser Asp Thr
275 280 285

Val Thr Leu Val Asn Val Ala Thr Gly Ala Gln Ala Val Ala Arg Ser
290 295 300

Leu Asp Trp Ser Lys Val Thr Leu Asp Gly Arg Pro Leu Thr Thr Ile
305 310 315 320

Gln Gln Tyr Ser Lys Thr Phe Tyr Val Leu Pro Leu Arg Gly Lys Leu
325 330 335

Ser Phe Trp Glu Ala Gly Thr Thr Lys Ala Gly Tyr Pro Tyr Asn Tyr
340 345 350

Asn Thr Thr Ala Ser Asp Gln Ile Leu Ile Glu Asn Ala Ala Gly His
355 360 365

Arg Val Ala Ile Ser Thr Tyr Thr Thr Ser Leu Gly Ala Gly Pro Thr
370 375 380

Ser Ile Ser Ala Val Gly Val Leu Ala Pro His Ser Ala Leu Ala Val
385 390 395 400

Leu Glu Asp Thr Ile Asp Tyr Pro Ala Arg Ala His Thr Phe Asp Asp
405 410 415

Phe Cys Pro Glu Cys Arg Thr Leu Gly Leu Gln Gly Cys Ala Phe Gln
420 425 430

Ser Thr Ile Ala Glu Leu Gln Arg Leu Lys Met Lys Val Gly Lys Thr
435 440 445

Arg Glu Ser Asp Tyr Lys Asp Asp Asp Asp Lys
450 455



<210> 208 
<211> 36 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Description of Artificial Sequence: Primer 
orf 2mid5p 

<400> 208 

tatatgaatt catgggtgct gatgggactg ctgagc 



<210> 209 
<211> 418 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 1440ol.seq 

<220> 

<221> CDS 

<222> (3)..(416)

<400> 209

ct ggc aty act act gcy att gag cag gct gct ctg gct gcg gcc aat 47
Gly Xaa Thr Thr Xaa Ile Glu Gln Ala Ala Leu Ala Ala Ala Asn
1 5 10 15

tcc gcc ttg gcg aat gct gtg gtg gtt cgg ccg ttt tta tcc cgt gtt 95
Ser Ala Leu Ala Asn Ala Val Val Val Arg Pro Phe Leu Ser Arg Val
20 25 30

caa act gat atc ctt att aac ctg atg caa ccc cgt cag ctt gtg ttc 143
Gln Thr Asp Ile Leu Ile Asn Leu Met Gln Pro Arg Gln Leu Val Phe
35 40 45

cgg cct gaa gtt ctc tgg aac cat ccg atc cag cga gtt ata cat aat 191
Arg Pro Glu Val Leu Trp Asn His Pro Ile Gln Arg Val Ile His Asn
50 55 60

gag ctg gaa caa tac tgt cga gcc cgc gct ggc cgc tgt ctt gag gtg 239
Glu Leu Glu Gln Tyr Cys Arg Ala Arg Ala Gly Arg Cys Leu Glu Val
65 70 75

ggc gct cac cca agg tct att aat gat aac ccc aat gtt ctg cac cgg 287
Gly Ala His Pro Arg Ser Ile Asn Asp Asn Pro Asn Val Leu His Arg
80 85 90 95

tgc ttt ctc cgc ccg gtt ggg aga gac gtc cag cgc tgg tat tcc gcc 335
Cys Phe Leu Arg Pro Val Gly Arg Asp Val Gln Arg Trp Tyr Ser Ala
100 105 110

ccc act cgt ggt cca gcg gct aac tgc cgc cgt tct gcg cta cgc ggt 383
Pro Thr Arg Gly Pro Ala Ala Asn Cys Arg Arg Ser Ala Leu Arg Gly
115 120 125

ttg ccc cct gtc gac cgc act tac tgt yty gat gg 418
Leu Pro Pro Val Asp Arg Thr Tyr Cys Xaa Asp
130 135



<210> 210 
<211> 138 
<212> PRT 

<213> Hepatitis E virus 
<400> 210 

Gly Xaa Thr Thr Xaa Ile Glu Gln Ala Ala Leu Ala Ala Ala Asn Ser
1 5 10 15

Ala Leu Ala Asn Ala Val Val Val Arg Pro Phe Leu Ser Arg Val Gln
20 25 30

Thr Asp Ile Leu Ile Asn Leu Met Gln Pro Arg Gln Leu Val Phe Arg
35 40 45

Pro Glu Val Leu Trp Asn His Pro Ile Gln Arg Val Ile His Asn Glu
50 55 60

Leu Glu Gln Tyr Cys Arg Ala Arg Ala Gly Arg Cys Leu Glu Val Gly
65 70 75 80

Ala His Pro Arg Ser Ile Asn Asp Asn Pro Asn Val Leu His Arg Cys
85 90 95

Phe Leu Arg Pro Val Gly Arg Asp Val Gln Arg Trp Tyr Ser Ala Pro
100 105 110

Thr Arg Gly Pro Ala Ala Asn Cys Arg Arg Ser Ala Leu Arg Gly Leu
115 120 125

Pro Pro Val Asp Arg Thr Tyr Cys Xaa Asp
130 135



<210> 211 
<211> 197 
<212> DNA 

<213> Hepatitis E virus 



<220> 

<223> 1440o2.seq 

<220> 

<221> CDS 

<222> (2)..(196)

<400> 211

g aca gaa ttr att tcg tcg gct gga ggt caa ctg ttc tac tcc cgc ccg 49
Thr Glu Xaa Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro
1 5 10 15

gtt gtc tca gcc aat ggc gag ccg act gtt aag tta tac acc tct gtc 97
Val Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val
20 25 30

gag aat gca cag cag gat aag ggc att gct ata cca cat gat ata gac 145
Glu Asn Ala Gln Gln Asp Lys Gly Ile Ala Ile Pro His Asp Ile Asp
35 40 45

tta ggg gat tcc cgt gtg gtt ata caa gat tat gay aac car cay gaa 193
Leu Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Xaa Asn Xaa Xaa Glu
50 55 60

caa g 197
Gln
65



<210> 212 
<211> 65 
<212> PRT 

<213> Hepatitis E virus 
<400> 212 

Thr Glu Xaa Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro
1 5 10 15

Val Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val
20 25 30

Glu Asn Ala Gln Gln Asp Lys Gly Ile Ala Ile Pro His Asp Ile Asp
35 40 45

Leu Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Xaa Asn Xaa Xaa Glu
50 55 60

Gln
65



<210> 213 
<211> 418 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 2015-1. seq 

<220> 

<221> CDS 

<222> (3)..(416)

<400> 213

ct ggc aty act act gcy att gag cag gct gct ctg gct gcg gct aac 47
Gly Xaa Thr Thr Xaa Ile Glu Gln Ala Ala Leu Ala Ala Ala Asn
1 5 10 15

tct gcc ttg gcg aat gct gtg gtg gtc cgg ccg ttc ctg tcc cgc act 95
Ser Ala Leu Ala Asn Ala Val Val Val Arg Pro Phe Leu Ser Arg Thr
20 25 30

cag act gat att ctt att aat ttg atg caa ccc cgg caa ctt gta ttc 143
Gln Thr Asp Ile Leu Ile Asn Leu Met Gln Pro Arg Gln Leu Val Phe
35 40 45

cgc cct gag gtt ttg tgg aac cat ccg atc cag cga gtc ata cat aat 191
Arg Pro Glu Val Leu Trp Asn His Pro Ile Gln Arg Val Ile His Asn
50 55 60

gag ctg gag cag tat tgc cgt gct cgt gct ggt cgc tgc ctg gag gtt 239
Glu Leu Glu Gln Tyr Cys Arg Ala Arg Ala Gly Arg Cys Leu Glu Val
65 70 75

ggg gct cat cca aga tct atc aat gac aac cct aat gtt ctg cac cgg 287
Gly Ala His Pro Arg Ser Ile Asn Asp Asn Pro Asn Val Leu His Arg
80 85 90 95

tgt ttc ctc cgt ccg gtt ggg cga gac gta cag cgt tgg tat tct gcc 335
Cys Phe Leu Arg Pro Val Gly Arg Asp Val Gln Arg Trp Tyr Ser Ala
100 105 110

cct act cgc ggc ccg gcg gct aat tgc cgc cgt tcc gcg tta cgt ggc 383
Pro Thr Arg Gly Pro Ala Ala Asn Cys Arg Arg Ser Ala Leu Arg Gly
115 120 125

cta cct cct gtc gac cgc act tac tgt yty gat gg 418
Leu Pro Pro Val Asp Arg Thr Tyr Cys Xaa Asp
130 135
<210> 214
<211> 138 
<212> PRT 

<213> Hepatitis E virus 
<400> 214 

Gly Xaa Thr Thr Xaa Ile Glu Gln Ala Ala Leu Ala Ala Ala Asn Ser
1 5 10 15

Ala Leu Ala Asn Ala Val Val Val Arg Pro Phe Leu Ser Arg Thr Gln
20 25 30

Thr Asp Ile Leu Ile Asn Leu Met Gln Pro Arg Gln Leu Val Phe Arg
35 40 45

Pro Glu Val Leu Trp Asn His Pro Ile Gln Arg Val Ile His Asn Glu
50 55 60

Leu Glu Gln Tyr Cys Arg Ala Arg Ala Gly Arg Cys Leu Glu Val Gly
65 70 75 80

Ala His Pro Arg Ser Ile Asn Asp Asn Pro Asn Val Leu His Arg Cys
85 90 95

Phe Leu Arg Pro Val Gly Arg Asp Val Gln Arg Trp Tyr Ser Ala Pro
100 105 110

Thr Arg Gly Pro Ala Ala Asn Cys Arg Arg Ser Ala Leu Arg Gly Leu
115 120 125

Pro Pro Val Asp Arg Thr Tyr Cys Xaa Asp
130 135



<210> 215 
<211> 197 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 2015o2.seq 

<220> 

<221> CDS 

<222> (2)..(196)

<400> 215

g aca gaa ttr att tcg tcg gct gga ggc cag ctc ttc tac tcc cgc cca 49
Thr Glu Xaa Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro
1 5 10 15

gtc gtc tca gcc aat ggc gag ccg act gtt aaa ttg tat aca tcc gtc 97
Val Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val
20 25 30

gag aat gcg cag cag gac aag ggc att gcc ata cca cat gat ata gat 145
Glu Asn Ala Gln Gln Asp Lys Gly Ile Ala Ile Pro His Asp Ile Asp
35 40 45

cta gga gat tcc cgc gtg gtt atc cag gat tat gay aac car cay gaa 193
Leu Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Xaa Asn Xaa Xaa Glu
50 55 60

caa g 197
Gln
65



<210> 216 
<211> 65 
<212> PRT 

<213> Hepatitis E virus 
<400> 216 

Thr Glu Xaa Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro
1 5 10 15

Val Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val
20 25 30

Glu Asn Ala Gln Gln Asp Lys Gly Ile Ala Ile Pro His Asp Ile Asp
35 40 45

Leu Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Xaa Asn Xaa Xaa Glu
50 55 60

Gln
65



<210> 217 
<211> 251 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 14404-2. seq 

<220> 

<221> CDS 

<222> (3)..(251)

<223> orf2

<220>

<223> orf3 from position 1 to position 165
<400> 217

at att cat cca acc aac ccc ttt gcc tcc gac gtc gta tcg caa tcc 47
Ile His Pro Thr Asn Pro Phe Ala Ser Asp Val Val Ser Gln Ser
1 5 10 15

ggg gct gga gct cgc cct cga cag ccg gcc cgc ccc ctc ggc tcc tct 95
Gly Ala Gly Ala Arg Pro Arg Gln Pro Ala Arg Pro Leu Gly Ser Ser
20 25 30

tgg cgt gac cag tcc cag cgc ccc ccc gct gtc ccc cgt cgt cga tct 143
Trp Arg Asp Gln Ser Gln Arg Pro Pro Ala Val Pro Arg Arg Arg Ser
35 40 45

acc cca act ggg gct gcg ccg cta act gct gtt tca cca gcg cct gat 191
Thr Pro Thr Gly Ala Ala Pro Leu Thr Ala Val Ser Pro Ala Pro Asp
50 55 60

acg gcc cca gtc cct gat gtt gac tct cgt ggc gct atc ttg cgc cgg 239
Thr Ala Pro Val Pro Asp Val Asp Ser Arg Gly Ala Ile Leu Arg Arg
65 70 75

cag tat aac cta 251
Gln Tyr Asn Leu
80



<210> 218 
<211> 83 
<212> PRT 

<213> Hepatitis E virus 
<400> 218 

Ile His Pro Thr Asn Pro Phe Ala Ser Asp Val Val Ser Gln Ser Gly
1 5 10 15

Ala Gly Ala Arg Pro Arg Gln Pro Ala Arg Pro Leu Gly Ser Ser Trp
20 25 30

Arg Asp Gln Ser Gln Arg Pro Pro Ala Val Pro Arg Arg Arg Ser Thr
35 40 45

Pro Thr Gly Ala Ala Pro Leu Thr Ala Val Ser Pro Ala Pro Asp Thr
50 55 60

Ala Pro Val Pro Asp Val Asp Ser Arg Gly Ala Ile Leu Arg Arg Gln
65 70 75 80

Tyr Asn Leu



<210> 219 
<211> 55 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> 14404-2. seq orf3 
<400> 219 

Ile Phe Ile Gln Pro Thr Pro Leu Pro Pro Thr Ser Tyr Arg Asn Pro
1 5 10 15

Gly Leu Glu Leu Ala Leu Asp Ser Arg Pro Ala Pro Ser Ala Pro Leu
20 25 30

Gly Val Thr Ser Pro Ser Ala Pro Pro Leu Ser Pro Val Val Asp Leu
35 40 45

Pro Gln Leu Gly Leu Arg Arg
50 55



<210> 220 
<211> 251 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 20154-2. seq 

<220> 

<221> CDS 

<222> (3)..(251)

<223> orf2

<220>

<223> orf3 from position 1 to position 165
<400> 220

at att cat cca acc aac ccc ttt gcc gcc gac gtc gta tca caa ccc 47
Ile His Pro Thr Asn Pro Phe Ala Ala Asp Val Val Ser Gln Pro
1 5 10 15

ggg gct gga gct cgc cct cga cag ccg ccc cgc ccc ctc ggc tcc tct 95
Gly Ala Gly Ala Arg Pro Arg Gln Pro Pro Arg Pro Leu Gly Ser Ser
20 25 30

tgg cgt gat cag tcc cag cgc ccc tcc gct gcc ccc cgt cgt cga tct 143
Trp Arg Asp Gln Ser Gln Arg Pro Ser Ala Ala Pro Arg Arg Arg Ser
35 40 45

acc cca gct ggg gct gcg ccg tta act gct gtt tcc cct gcg ccc gat 191
Thr Pro Ala Gly Ala Ala Pro Leu Thr Ala Val Ser Pro Ala Pro Asp
50 55 60

acg gcc cca gtc ccc gac gtt gat tcc cgt ggt gcc atc ctg cgc cgg 239
Thr Ala Pro Val Pro Asp Val Asp Ser Arg Gly Ala Ile Leu Arg Arg
65 70 75

cag tat aac cta 251
Gln Tyr Asn Leu
80



<210> 221 
<211> 83 
<212> PRT 

<213> Hepatitis E virus 



<400> 221 
Ile His Pro Thr Asn Pro Phe Ala Ala Asp Val Val Ser Gln Pro Gly
1 5 10 15

Ala Gly Ala Arg Pro Arg Gln Pro Pro Arg Pro Leu Gly Ser Ser Trp
20 25 30

Arg Asp Gln Ser Gln Arg Pro Ser Ala Ala Pro Arg Arg Arg Ser Thr
35 40 45

Pro Ala Gly Ala Ala Pro Leu Thr Ala Val Ser Pro Ala Pro Asp Thr
50 55 60

Ala Pro Val Pro Asp Val Asp Ser Arg Gly Ala Ile Leu Arg Arg Gln
65 70 75 80

Tyr Asn Leu



<210> 222 
<211> 55 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> 20154-2. seq orf3 
<400> 222 

Ile Phe Ile Gln Pro Thr Pro Leu Pro Pro Thr Ser Tyr His Asn Pro
1 5 10 15

Gly Leu Glu Leu Ala Leu Asp Ser Arg Pro Ala Pro Ser Ala Pro Leu
20 25 30

Gly Val Ile Ser Pro Ser Ala Pro Pro Leu Pro Pro Val Val Asp Leu
35 40 45

Pro Gln Leu Gly Leu Arg Arg
50 55



<210> 223 
<211> 48 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> US-2 3-2e 
<400> 223 

Thr Ile Asp Tyr Pro Ala Arg Ala His Thr Phe Asp Asp Phe Cys Pro
1 5 10 15

Glu Cys Arg Thr Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Ile
20 25 30

Ala Glu Leu Gln Arg Leu Lys Met Lys Val Gly Lys Thr Arg Glu Ser
35 40 45

<210> 224
<211> 33 
<212> PRT 

<213> Hepatitis E virus 
<220> 

<223> US-2 4-2 
<400> 224 

Asp Ser Arg Pro Ala Pro Leu Val Pro Leu Gly Val Thr Ser Pro Ser
1 5 10 15

Ala Pro Pro Leu Pro Pro Val Val Asp Leu Pro Gln Leu Gly Leu Arg
20 25 30

Arg



<210> 225 
<211> 450 
<212> DNA 

<213> Hepatitis E virus 



<220> 

<223> 5p.pile {hpesvp} 



<400> 225 
ggctcctggc atcactactg ctattgagca ggctgctcta gcagcggcca actctgccct 60
ggcgaatgct gtggtagtta ggccttttct ctctcaccag cagattgaga tcctcattaa 120
cctaatgcaa cctcgccagc ttgttttccg ccccgaggtt ttctggaatc atcccatcca 180
gcgtgtcatc cataacgagc tggagcttta ctgccgcgcc cgctccggcc gctgtcttga 240
aattggcgcc catccccgct caataaatga taatcctaat gtggtccacc gctgcttcct 300
ccgccctgtt gggcgtgatg ttcagcgctg gtatactgct cccactcgcg ggccggctgc 360
taattgccgg cgttccgcgc tgcgcgggct tcccgctgct gaccgcactt actgcctcga 420
cgggttttct ggctgtaact ttcccgccga 450

<210> 226
<211> 450
<212> DNA

<213> Hepatitis E virus
<220>

<223> 5p.pile {hpeuigh}
<400> 226 

ggctcctggc atcactactg ctattgagca ggctgctcta gcagcggcca attctgccct 60 
tgcgaatgct gtggtagtta ggccttttct ctctcaccag cagattgaga tccttattaa 120 
cctaatgcaa cctcgccagc ttgttttccg ccccgaggtt ttctggaacc accccatcca 180 
gcgtgtcatc cataatgagc tggagcttta ctgtcgcgcc cgctccggcc gctgccttga 240 
aattggtgcc caccctcgct caataaacga caatcctaat gtggtccacc gctgcttcct 300 
ccgccctgcc gggcgtgatg ttcagcgttg gtatactgct cctacccgcg ggccggctgc 360 
taattgccgg ggttccgcac tgcgcgggct ccccgctgct gaccgcactt actgcttcga 420
cgggttttct ggctgtaact ttcccgccga 450

<210> 227 
<211> 450 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 5p.pile {hpea}
<400> 227 

ggctcctggc atcactactg ctattgagca ggctgctcta gcagcggcca actctgccct 60 
tgcgaatgct gtggtagtta ggccttttct ctctcaccag cagattgaga tccttattaa 120
cctaatgcaa cctcgccagc ttgttttccg ccccgaggtt ttctggaacc atcccatcca 180 
gcgtgttatc cataatgagc tggagcttta ctgtcgcgcc cgctccggcc gctgcctcga 240 
aattggtgcc cacccccgct caataaatga caatcctaat gtggtccacc gttgcttcct 300 
ccgtcctgcc gggcgtgatg ttcagcgttg gtatactgcc cctacccgcg ggccggctgc 360 
taattgccgg cgttccgcgc tgcgcgggct ccccgctgct gaccgcactt actgcttcga 420
cgggttttct ggctgtaact ttcccgccga 450 



<210> 228 
<211> 446 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 5p.pile {840455p}
<400> 228 

cctggcatta ctactgccat tgagcaggct gctctggctg cggccaattc tgccttggcg 60
aatgctgtgg tggttcggcc gtttttatct cgcgtgcaaa ccgagattct tattaatttg 120
atgcaacccc ggcagttggt tttccgccct gaggtacttt ggaatcaccc tatccagcgg 180
gttatacata atgaattaga acagtactgc cgggctcggg ctggtcgttg cttggaggtt 240
ggagctcacc caagatccat taatgacaac cccaacgttc tgcatcggtg tttccttaga 300
ccggttggcc gagatgttca gcgctggtac tctgccccca cccgcggccc tgcggctaat 360
tgccgccgct ccgcgttgcg tggtctcccc cccgctgacc gcacttactg ctttgatgga 420
ttctcccgtt gtgcttttgc tgcaga 446



<210> 229 
<211> 450 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 5p.pile {hpenssp}
<400> 229 

ggctcctggc atcactactg ctattgagca agcagctcta gcagcggcca actccgccct 60 
tgcgaatgct gtggtggtcc ggcctttcct ttcccatcag caggttgaga tccttataaa 120
tctcatgcaa cctcggcagc tggtgtttcg tcctgaggtt ttttggaatc acccgattca 180
acgtgttata cataatgagc ttgagcagta ttgccgtgct cgctcgggtc gctgccttga 240 
gattggagcc cacccacgct ccattaatga taatcctaat gtcctccatc gctgctttct 300 
ccaccccgtc ggccgggatg ttcagcgctg gtacacagcc ccgactaggg gacctgcggc 360 
gaactgtcgc cgctcggcac ttcgtggtct gccaccagcc gaccgcactt actgttttga 420 
tggctttgcc ggctgccgtt ttgccgccga 450 



<210> 230 
<211> 450 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 5p Consensus 
<220> 

<221> variation 
<222> (1)..(450)

<223> The nucleotide identity of each n is indicated in 
Figure 9 . 

<400> 230 

nnnncctggc atnactactg cnattgagca ngcngctctn gcngcggcca antcngccnt 60
ngcgaatgct gtggtngtnn ggccnttnnt ntcncnnnng cannnngaga tnctnatnaa 120
nntnatgcaa ccncgncagn tngtnttncg nccngaggtn ntntggaanc anccnatnca 180 
ncgngtnatn cataangann tngancnnta ntgncgngcn cgnncnggnc gntgnntnga 240 
nnttggngcn canccnngnt cnatnaanga naanccnaan gtnntncanc gntgnttnct 300 
nnnnccngnn ggncgngatg ttcagcgntg gtanncngcn ccnacnngng gnccngcngc 360 
naantgncgn ngntcngcnn tncgnggnct nccnncngcn gaccgcactt actgnntnga 420 
nggnttnncn ngntgnnnnt ttncngcnga 450

<210> 231 
<211> 300 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 3p.pile {hpea} shown in Figure 9B 
<400> 231 

actgagtcag tgaagccagt gcttgacctg acaaattcaa ttctgtgtcg ggtggaatga 60
ataacatgtc ttttgctgcg cccatgggtt cgcgaccatg cgccctcggc ctattttgct 120 
gttgctcctc atgtttctgc ctatgctgcc cgcgccaccg cccggtcagc cgtctggccg 180 
ccgtcgtggg cggcgcagcg gcggttccgg cggtggtttc tggggtgacc gggttgattc 240 
tcagcccttc gcaatcccct atattcatcc aaccaacccc ttcgcccccg atgtcaccgc 300

<210> 232 
<211> 300 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 3p.pile {hpeuigh} shown in Figure 9B 
<400> 232 

actgagtcgg tgaagccagt gctcgacttg acaaattcaa tcctgtgtcg ggtggaatga 60
ataacatgtc ttttgctgcg cccatgggtt ggcgaccatg cgccctcggc ctattttgct 120
gttgctcctc atgtttctgc ctatcgtgcc cgcgccaccg cccggtcagc cgtctggccg 180 
ccgtcgtggg cggcgcagcg gcggttccgg cggtggtttc tggggtgacc gggttgattc 240
tcagcccttc gcaatcccct atattcatcc aaccaacccc ttcgcccccg atgtcaccgc 300 



<210> 233 
<211> 300
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 3p.pile {hpesvp} shown in Figure 9B 
<400> 233 

actgagtcag taaaaccagt gctcgacttg acaaattcaa tcttgtgtcg ggtggaatga 60 
ataacatgtc ttttgctgcg cccatgggtt cgcgaccatg cgccctcggc ctattttgtt 120
gctgctcctc atgtttttgc ctatgctgcc cgcgccaccg cccggtcagc cgtctggccg 180
ccgtcgtggg cggcgcagcg gcggttccgg cggtggtttc tggggtgacc gggttgattc 240 
tcagcccttc gcaatcccct atattcatcc aaccaacccc ttcgcccccg atgtcaccgc 300 

<210> 234 
<211> 300 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 3p.pile {hpenssp} shown in Figure 9B
<400> 234 

acagagtctg ttaagcctat acttgacctt acacactcaa ttatgcaccg gtctgaatga 60 
ataacatgtg gtttgctgcg cccatgggtt cgccaccatg cgccctaggc ctcttttgct 120
gttgttcctc ttgtttctgc ctatgttgcc cgcgccaccg accggtcagc cgtctggccg 180
ccgtcgtggg cggcgcagcg gcggtaccgg cggtggtttc tggggtgacc gggttgattc 240
tcagcccttc gcaatcccct atattcatcc aaccaacccc tttgccccag acgttgccgc 300 

<210> 235 
<211> 297 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 3p.pile {840453p} shown in Figure 9B 
<400> 235 

acagagacta ttaaacctgt acttgatctc acaaattcca tcatacagcg ggtggaatga 60 
ataacatgtc ttttgcatcg cccatgggat caccatgcgc cctagggctg ttctgttgtt 120 
gttcctcatg tttctgccta tgctgcccgc gccaccggcc ggtcagccgt ctggccgtcg 180
ccgtgggcgg cgcagcggcg gtgccggcgg tggtttctgg agtgacaggg ttgattctca 240 
gcccttcgcc ctcccctata ttcatccaac caaccccttc gccgccgatg tcgtttc 297 

<210> 236
<211> 300 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 3p Consensus shown in Figure 9B
<220>

<221> variation
<222> (1)..(300)

<223> The nucleotide identity of each n is indicated in 
Figure 9B 

<400> 236 

acngagncnn tnaanccnnt nctnganntn acanantcna tnntnnnncg gnnngaatga 60
ataacatgtn ntttgcnncg cccatgggnt nnnnaccatg cgccctnggn ctnttntgnt 120
gntgntcctc ntgtttntgc ctatnntgcc cgcgccaccg nccggtcagc cgtctggccg 180
ncgncgtggg cggcgcagcg gcggtnccgg cggtggtttc tggngtgacn gggttgattc 240
tcagcccttc gcnntcccct atattcatcc aaccaacccc ttngccncng angtnnnnnc 300 



<210> 237 
<211> 250 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 3p.pile {hpea} shown in Figure 9C 
<400> 237 

agcgcttacc ctgtttaacc ttgctgacac cctgcttggc ggtctaccga cagaattgat 60
ttcgtcggct ggtggccagc tgttctactc tcgccccgtc gtctcagcca atggcgagcc 120
gactgttaag ctgtatacat ctgtggagaa tgctcagcag gataagggta ttgcaatccc 180 
gcatgacatc gacctcgggg aatcccgtgt agttattcag gattatgaca accaacatga 240 
gcaggaccga 250 



<210> 238 
<211> 250 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 3p.pile {hpeuigh} shown in Figure 9C 



<400> 238 

agcgcttacc ctgtttaacc ttgctgacac cctgcttggc ggtctaccga cagaattgat 60
ttcgtcggct ggtggccagc tgttctactc tcgccccgtc gtctcagcca atggcgagcc 120
gactgttaag ctgtatacat ctgtagagaa tgctcagcag gataagggta ttgcaatccc 180
gcatgacatc gacctcgggg aatctcgagt tgttattcag gattatgaca accaacatga 240 
gcaggaccgg 250 

<210> 239 
<211> 250 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 3p.pile {hpesvp} shown in Figure 9C
<400> 239 

agccctcacc ctgttcaacc ttgctgacac tctgcttggc ggcctgccga cagaattgat 60 
ttcgtcggct ggtggccagc tgttctactc ccgtcccgtt gtctcagcca atggcgagcc 120 
gactgttaag ttgtatacat ctgtagagaa tgctcagcag gataagggta ttgcaatccc 180 
gcatgacatt gacctcggag aatctcgtgt ggttattcag gattatgata accaacatga 240 
acaagatcgg 250 

<210> 240 
<211> 250 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 3p.pile {hpenssp} shown in Figure 9C 
<400> 240 

agctctaaca ttacttaacc ttgctgacac gctcctcggc gggctcccga cagaattaat 60 
ttcgtcggct ggcgggcaac tgttttattc ccgcccggtt gtctcagcca atggcgagcc 120
aaccgtgaag ctctatacat cagtggagaa tgctcagcag gataagggtg ttgctatccc 180
ccacgatatc gatcttggtg attcgcgtgt ggtcattcag gattatgaca accagcatga 240
gcaggatcgg 250



<210> 241 
<211> 250 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 3p.pile {840453p} shown in Figure 9C
<400> 241 

tgccctgact ctgtttaatc ttgctgatac gcttcttggt ggtttaccga cagaattgat 60 
ttcgtcggct gggggtcaac tgttttactc ccgccctgtt cagaattgat ttcgtcggct 120
gggggtcaac tgttttactc ccgccctgtt tgcgcagcaa gacaagggca tcaccattcc 180
acacgacata gatttaggtg actcccgtgt ggttatccag gattatgata accagcacga 240
acaagatcga 250 

<210> 242 
<211> 250 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> 3p Consensus shown in Figure 9C 
<220> 

<221> variation 
<222> (1)..(250)

<223> The nucleotide identity of each n is indicated in 
Figure 9C 

<400> 242 

ngcnctnacn ntnntnaanc ttgctganac nctnctnggn ggnntnccga cagaattnat 60
ttcgtcggct ggnggncanc tgttntantc ncgnccngtn gtctcngcca atggcgagcc 120
nacngtnaag ntntanacat cngtngagaa tgcncagcan ganaagggnn tnncnatncc 180
ncanganatn ganntnggng antcncgngt ngtnatncag gattatgana accancanga 240
ncangancgn 250

<210> 243
<211> 418
<212> DNA

<213> Hepatitis E virus
<220>

<223> Aulol-wlabolpl .pat

<220>

<221> CDS

<222> (3)..(416)

<400> 243 

ct ggc aty act act gcy att gag caa gct gct ctg gct gcg gcc aat 47
Gly Xaa Thr Thr Xaa Ile Glu Gln Ala Ala Leu Ala Ala Ala Asn
1 5 10 15

tct gcc ttg gcg aat gct gtg gtg gtt cgg ccg ttt tta tcc cgt gtg 95
Ser Ala Leu Ala Asn Ala Val Val Val Arg Pro Phe Leu Ser Arg Val
20 25 30

cag act gag atc ctt att aac ttg atg caa cct cgg cag ctg gtg ttc 143
Gln Thr Glu Ile Leu Ile Asn Leu Met Gln Pro Arg Gln Leu Val Phe
35 40 45

cga cct gag gtg ctt tgg aat cat ccc att cag cgg gtt atc cat aat 191
Arg Pro Glu Val Leu Trp Asn His Pro Ile Gln Arg Val Ile His Asn
50 55 60

gag tta gaa caa tac tgc cgg gcc cgg gcc ggc cgt tgc cta gag gtg 239
Glu Leu Glu Gln Tyr Cys Arg Ala Arg Ala Gly Arg Cys Leu Glu Val
65 70 75

ggg gcc cac cca agg tcc att aac gat aac ccc aat gtt ttg cac cgg 287
Gly Ala His Pro Arg Ser Ile Asn Asp Asn Pro Asn Val Leu His Arg
80 85 90 95

tgt ttt ctg cga ccg gtc ggg agg gat gtt cag cgc tgg tac tct gcc 335
Cys Phe Leu Arg Pro Val Gly Arg Asp Val Gln Arg Trp Tyr Ser Ala
100 105 110

ccc acc cgc ggc cct gcg gct aac tgc cgc cgc tcc gct ttg cgt ggc 383
Pro Thr Arg Gly Pro Ala Ala Asn Cys Arg Arg Ser Ala Leu Arg Gly
115 120 125

ctt ccc ccc gtc gac cgc act tac tgt yty gat gg 418
Leu Pro Pro Val Asp Arg Thr Tyr Cys Xaa Asp
130 135



<210> 244 
<211> 138 
<212> PRT 

<213> Hepatitis E virus 
<400> 244 

Gly Xaa Thr Thr Xaa Ile Glu Gln Ala Ala Leu Ala Ala Ala Asn Ser
1 5 10 15

Ala Leu Ala Asn Ala Val Val Val Arg Pro Phe Leu Ser Arg Val Gln
20 25 30

Thr Glu Ile Leu Ile Asn Leu Met Gln Pro Arg Gln Leu Val Phe Arg
35 40 45

Pro Glu Val Leu Trp Asn His Pro Ile Gln Arg Val Ile His Asn Glu
50 55 60

Leu Glu Gln Tyr Cys Arg Ala Arg Ala Gly Arg Cys Leu Glu Val Gly
65 70 75 80

Ala His Pro Arg Ser Ile Asn Asp Asn Pro Asn Val Leu His Arg Cys
85 90 95

Phe Leu Arg Pro Val Gly Arg Asp Val Gln Arg Trp Tyr Ser Ala Pro
100 105 110

Thr Arg Gly Pro Ala Ala Asn Cys Arg Arg Ser Ala Leu Arg Gly Leu
115 120 125

Pro Pro Val Asp Arg Thr Tyr Cys Xaa Asp
130 135



<210> 245 
<211> 197 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Aulo2-wlao2 .pat 

<220> 

<221> CDS 

<222> (2)..(196)

<400> 245 

g aca gaa ttr att tcg tcg gct ggg gga cag tta ttc tac tcc cgc cct 49
Thr Glu Xaa Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro
1 5 10 15

gty gtc tca gcc aat ggc gag ccg act gtt aaa tta tat aca tct gta 97
Xaa Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val
20 25 30

gag aat gcg cag cag gac aag ggg att gcc atc cca cat gat ata gat 145
Glu Asn Ala Gln Gln Asp Lys Gly Ile Ala Ile Pro His Asp Ile Asp
35 40 45

ctg ggc gac tct cgt gtg gtg atc cag gat tat gay aac car cay gaa 193
Leu Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Xaa Asn Xaa Xaa Glu
50 55 60

caa g 197
Gln
65



<210> 246 
<211> 65 
<212> PRT 

<213> Hepatitis E virus 
<400> 246 

Thr Glu Xaa Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro
1 5 10 15

Xaa Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val
20 25 30

Glu Asn Ala Gln Gln Asp Lys Gly Ile Ala Ile Pro His Asp Ile Asp
35 40 45

Leu Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Xaa Asn Xaa Xaa Glu
50 55 60

Gln
65



<210> 247 
<211> 418 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Arlol- f73olpl.pat 

<220> 

<221> CDS 

<222> (3)..(416)

<400> 247 

ct ggc aty act act gcy att gag caa gct gct ctg gct gcg gcc aac 47
Gly Xaa Thr Thr Xaa Ile Glu Gln Ala Ala Leu Ala Ala Ala Asn
1 5 10 15

tct gcc ttg gcg aat gct gtg gtg gtt cgg ccg ttt tta tcc cgt gtg 95
Ser Ala Leu Ala Asn Ala Val Val Val Arg Pro Phe Leu Ser Arg Val
20 25 30

cag acc gag att ctt att aac cta atg caa ccc cgg cag ctg gtt ttt 143
Gln Thr Glu Ile Leu Ile Asn Leu Met Gln Pro Arg Gln Leu Val Phe
35 40 45

cgt cct gag gtg ctt tgg aac cat cct atc cag cgg gtt att cat aat 191
Arg Pro Glu Val Leu Trp Asn His Pro Ile Gln Arg Val Ile His Asn
50 55 60

gag tta gaa cag tac tgt cgg gct cgg gct ggt cgc tgc cta gag gtc 239
Glu Leu Glu Gln Tyr Cys Arg Ala Arg Ala Gly Arg Cys Leu Glu Val
65 70 75

ggg gcc cac cca agg tcc att aat gat aac cct aat gtt ttg cac cgg 287
Gly Ala His Pro Arg Ser Ile Asn Asp Asn Pro Asn Val Leu His Arg
80 85 90 95

tgc ttc cta cga cca gtc ggg agg gat gtt caa cgt tgg tat tcc gcc 335
Cys Phe Leu Arg Pro Val Gly Arg Asp Val Gln Arg Trp Tyr Ser Ala
100 105 110

ccc acc cgc ggt cct gct gcc aac tgc cgc cgt tcc gct ctg cgc ggc 383
Pro Thr Arg Gly Pro Ala Ala Asn Cys Arg Arg Ser Ala Leu Arg Gly
115 120 125

ctc cct ccc gtc gac cgc act tac tgt yty gat gg 418
Leu Pro Pro Val Asp Arg Thr Tyr Cys Xaa Asp
130 135

<210> 248
<211> 138
<212> PRT

<213> Hepatitis E virus
<400> 248

Gly Xaa Thr Thr Xaa Ile Glu Gln Ala Ala Leu Ala Ala Ala Asn Ser
1 5 10 15

Ala Leu Ala Asn Ala Val Val Val Arg Pro Phe Leu Ser Arg Val Gln
20 25 30

Thr Glu Ile Leu Ile Asn Leu Met Gln Pro Arg Gln Leu Val Phe Arg
35 40 45

Pro Glu Val Leu Trp Asn His Pro Ile Gln Arg Val Ile His Asn Glu
50 55 60

Leu Glu Gln Tyr Cys Arg Ala Arg Ala Gly Arg Cys Leu Glu Val Gly
65 70 75 80

Ala His Pro Arg Ser Ile Asn Asp Asn Pro Asn Val Leu His Arg Cys
85 90 95

Phe Leu Arg Pro Val Gly Arg Asp Val Gln Arg Trp Tyr Ser Ala Pro
100 105 110

Thr Arg Gly Pro Ala Ala Asn Cys Arg Arg Ser Ala Leu Arg Gly Leu
115 120 125

Pro Pro Val Asp Arg Thr Tyr Cys Xaa Asp
130 135



<210> 249 
<211> 145 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Arl-f 73o2p2 .pat 

<220> 

<221> CDS 

<222> (1)..(144)

<400> 249 

gty gtc tcr gcc aat ggc gag ccg act gtt aag cta tat aca tct gta 48
Xaa Val Xaa Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val
1 5 10 15

gag aac gcg cag cag gat aaa ggg atc gcc att cca cac gat ata gat 96
Glu Asn Ala Gln Gln Asp Lys Gly Ile Ala Ile Pro His Asp Ile Asp
20 25 30

ctg ggc gat tcc cgt gtg gtc att cag gat tat gay aac car cay gaa c 145
Leu Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Xaa Asn Xaa Xaa Glu
35 40 45



<210> 250 
<211> 48 
<212> PRT 

<213> Hepatitis E virus 
<400> 250 

Xaa Val Xaa Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val
1 5 10 15

Glu Asn Ala Gln Gln Asp Lys Gly Ile Ala Ile Pro His Asp Ile Asp
20 25 30

Leu Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Xaa Asn Xaa Xaa Glu
35 40 45



<210> 251 
<211> 418 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Ar2ol-f 77olpl .pat 

<220> 

<221> CDS 

<222> (3)..(416)

<400> 251 

ct ggc aty act act gcy att gag caa gct gct ctg gct gcg gct aac 47
Gly Xaa Thr Thr Xaa Ile Glu Gln Ala Ala Leu Ala Ala Ala Asn
1 5 10 15

tct gcc ttg gcg aat gct gtg gtg gtt cgg ccg ttt cta tcc cgt gtg 95
Ser Ala Leu Ala Asn Ala Val Val Val Arg Pro Phe Leu Ser Arg Val
20 25 30

cag act gag atc ctt att aac tta atg car ccc cgg car ctg gtt ttc 143
Gln Thr Glu Ile Leu Ile Asn Leu Met Xaa Pro Arg Xaa Leu Val Phe
35 40 45

cgt ccc gag gtg ctt tgg aat cat ccc att caa cgg gtt att cat aat 191
Arg Pro Glu Val Leu Trp Asn His Pro Ile Gln Arg Val Ile His Asn
50 55 60

gaa tta gag cag tac tgc cgg acc cgg gct ggc cgt tgt tta gag gtc 239
Glu Leu Glu Gln Tyr Cys Arg Thr Arg Ala Gly Arg Cys Leu Glu Val
65 70 75

gga gcc cat cca agg tcc att aat gac aac cct aac gtt cyg cac cgg 287
Gly Ala His Pro Arg Ser Ile Asn Asp Asn Pro Asn Val Xaa His Arg
80 85 90 95

tgc ttc tta cga cca gtc ggg agg gat gtc caa cga tgg tac tca gcc 335
Cys Phe Leu Arg Pro Val Gly Arg Asp Val Gln Arg Trp Tyr Ser Ala
100 105 110

ccc act cgc ggc cct gcg gct aat tgc cgt cgt tcc gct ttg cgt ggt 383
Pro Thr Arg Gly Pro Ala Ala Asn Cys Arg Arg Ser Ala Leu Arg Gly
115 120 125

ctc cct cct gtc gac cgc act tac tgt yty gat gg 418
Leu Pro Pro Val Asp Arg Thr Tyr Cys Xaa Asp
130 135



<210> 252 
<211> 138 
<212> PRT 

<213> Hepatitis E virus 
<400> 252 

Gly Xaa Thr Thr Xaa Ile Glu Gln Ala Ala Leu Ala Ala Ala Asn Ser
1 5 10 15

Ala Leu Ala Asn Ala Val Val Val Arg Pro Phe Leu Ser Arg Val Gln
20 25 30

Thr Glu Ile Leu Ile Asn Leu Met Xaa Pro Arg Xaa Leu Val Phe Arg
35 40 45

Pro Glu Val Leu Trp Asn His Pro Ile Gln Arg Val Ile His Asn Glu
50 55 60

Leu Glu Gln Tyr Cys Arg Thr Arg Ala Gly Arg Cys Leu Glu Val Gly
65 70 75 80

Ala His Pro Arg Ser Ile Asn Asp Asn Pro Asn Val Xaa His Arg Cys
85 90 95

Phe Leu Arg Pro Val Gly Arg Asp Val Gln Arg Trp Tyr Ser Ala Pro
100 105 110

Thr Arg Gly Pro Ala Ala Asn Cys Arg Arg Ser Ala Leu Arg Gly Leu
115 120 125

Pro Pro Val Asp Arg Thr Tyr Cys Xaa Asp
130 135



<210> 253 
<211> 197 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> Ar2o2-f 7702 .pat 
<220> 

<221> CDS

<222> (2)..(196)

<400> 253 

g aca gaa ttr att tcg tcg gct ggg ggt cag ttg ttt tac tcc cgc cct 49
Thr Glu Xaa Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro
1 5 10 15

gtc gtc tca gcc aat ggc gag ccg act gtt aag ttg tat aca tct gtg 97
Val Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val
20 25 30

gag aat gcg cag cag gat aaa gga atc gcc atc cca cac gac ata gat 145
Glu Asn Ala Gln Gln Asp Lys Gly Ile Ala Ile Pro His Asp Ile Asp
35 40 45

ctg ggc gat tcc cgt gtg gtt att cag gat tat gay aac car cay gaa 193
Leu Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Xaa Asn Xaa Xaa Glu
50 55 60

caa g 197
Gln
65



<210> 254
<211> 65
<212> PRT

<213> Hepatitis E virus
<400> 254

Thr Glu Xaa Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro
1 5 10 15

Val Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val
20 25 30

Glu Asn Ala Gln Gln Asp Lys Gly Ile Ala Ile Pro His Asp Ile Asp
35 40 45

Leu Gly Asp Ser Arg Val Val Ile Gln Asp Tyr Xaa Asn Xaa Xaa Glu
50 55 60

Gln
65



<210> 255 
<211> 23 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEVConsORF 1N-a1



<400> 255 

ccrtcrarrc artaggtgcg gtc 23

<210> 256
<211> 25 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEVConsORF 2N-a1
<400> 256 

cytgytcrtg ytggttrtca taatc 25 



<210> 257 
<211> 21 
<212> DNA 

<213> Hepatitis E virus 

<220>

<223> HEVConsORF 1N-S2

<400> 257

cygccytkgc gaatgctgtg g 21



<210> 258 
<211> 25 
<212> DNA 

<213> Hepatitis E virus 
<220> 

<223> HEVConsORF 2N-a2 



<400> 258 

gytcrtgytg rttrtcataa tcctg 25



